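What mainly determines import speed? Would importing as plain text, with all the messy HTML tags stripped out, be any faster?
Can I generate the dictionary vector data on a desktop and then copy it to a laptop? If so, which files need to be copied?
The features are incredibly powerful, second to none. A video tutorial would be great, though. Some features I couldn't figure out and never got working.
The problem is still at the last step.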
2024-09-01 14:25:01.268 [info] Building wheels for collected packages: kenlm
2024-09-01 14:25:01.271 [info] Building wheel for kenlm (pyproject.toml): started
2024-09-01 14:25:04.424 [info] Building wheel for kenlm (pyproject.toml): finished with status ‘error’
2024-09-01 14:25:04.433 [error] error
2024-09-01 14:25:04.434 [error] : subprocess-exited-with-error
Building wheel for kenlm (pyproject.toml) did not run successfully.
exit code: 1
[86 lines of output]
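'bash' is not recognized as an internal or external command,
operable program or batch file.
'bash' is not recognized as an internal or external command,
operable program or batch file.
'bash' is not recognized as an internal or external command,
operable program or batch file.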
Will build with KenLM max_order set to 6
running bdist_wheel
running build
running build_ext
– Building for: NMake Makefiles
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
CMake Error at CMakeLists.txt:14 (project):
Generator
NMake Makefiles
does not support platform specification, but platform
x64
was specified.
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
– Configuring incomplete, errors occurred!
Traceback (most recent call last):
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\pip_vendor\pyproject_hooks_in_process_in_process.py”, line 353, in
main()
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\pip_vendor\pyproject_hooks_in_process_in_process.py”, line 335, in main
json_out[‘return_val’] = hook(**hook_input[‘kwargs’])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\pip_vendor\pyproject_hooks_in_process_in_process.py”, line 251, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\build_meta.py”, line 421, in build_wheel
return self._build_with_temp_dir(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\build_meta.py”, line 403, in build_with_temp_dir
self.run_setup()
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\build_meta.py”, line 503, in run_setup
super().run_setup(setup_script=setup_script)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\build_meta.py”, line 318, in run_setup
exec(code, locals())
File “”, line 124, in
File "f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_init.py", line 117, in setup
return distutils.core.setup(**attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\core.py”, line 184, in setup
return run_commands(dist)
^^^^^^^^^^^^^^^^^^
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\core.py”, line 200, in run_commands
dist.run_commands()
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\dist.py”, line 953, in run_commands
self.run_command(cmd)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\dist.py”, line 950, in run_command
super().run_command(command)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\dist.py”, line 972, in run_command
cmd_obj.run()
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\command\bdist_wheel.py”, line 384, in run
self.run_command(“build”)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\cmd.py”, line 316, in run_command
2024-09-01 14:25:04.434 [error]
self.distribution.run_command(command)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\dist.py”, line 950, in run_command
super().run_command(command)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\dist.py”, line 972, in run_command
cmd_obj.run()
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\command\build.py”, line 135, in run
self.run_command(cmd_name)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\cmd.py”, line 316, in run_command
self.distribution.run_command(command)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools\dist.py”, line 950, in run_command
super().run_command(command)
File “f:\�ʵ��ѯ\python-addon\env\Lib\site-packages\setuptools_distutils\dist.py”, line 972, in run_command
cmd_obj.run()
File “”, line 104, in run
File “subprocess.py”, line 413, in check_call
subprocess.CalledProcessError: Command ‘[‘cmake’, ‘C:\Users\Dvid\AppData\Local\Temp\pip-install-wks71qlw\kenlm_579165bae9924e12b835bcfa3a433292’, ‘-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=C:\Users\Dvid\AppData\Local\Temp\pip-install-wks71qlw\kenlm_579165bae9924e12b835bcfa3a433292\build\lib.win-amd64-cpython-312’, ‘-DBUILD_SHARED_LIBS=ON’, ‘-DBUILD_PYTHON_STANDALONE=ON’, ‘-DKENLM_MAX_ORDER=6’, ‘-DCMAKE_WINDOWS_EXPORT_ALL_SYMBOLS=ON’, ‘-DCMAKE_RUNTIME_OUTPUT_DIRECTORY_RELEASE=C:\Users\Dvid\AppData\Local\Temp\pip-install-wks71qlw\kenlm_579165bae9924e12b835bcfa3a433292\build\lib.win-amd64-cpython-312’, ‘-DCMAKE_LIBRARY_OUTPUT_DIRECTORY_RELEASE=C:\Users\Dvid\AppData\Local\Temp\pip-install-wks71qlw\kenlm_579165bae9924e12b835bcfa3a433292\build\lib.win-amd64-cpython-312’, ‘-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY_RELEASE=C:\Users\Dvid\AppData\Local\Temp\pip-install-wks71qlw\kenlm_579165bae9924e12b835bcfa3a433292\build\lib.win-amd64-cpython-312’, ‘-A’, ‘x64’]’ returned non-zero exit status 1.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
2024-09-01 14:25:04.436 [error] ERROR: Failed building wheel for kenlm
2024-09-01 14:25:04.437 [info] Failed to build kenlm
2024-09-01 14:25:04.763 [error] ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (kenlm)
2024-09-01 14:25:08.221 [info] 安装完成
2024-09-01 14:25:08.599 [info] ����
2024-09-01 14:25:08.599 [error] Traceback (most recent call last):
File “app.py”, line 18, in
File “error_correction/error_correction.py”, line 1, in
2024-09-01 14:25:08.600 [error] ModuleNotFoundError: No module named ‘sympy’
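Got it. In that case the text-correction feature won't work for now, but the other features are unaffected. I'll find an environment to test in and see which configuration is causing the problem.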
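The log above points at three separate gaps: `bash` is not found, CMake reports no C/C++ compiler, and the `sympy` module is missing at startup. A minimal sketch for checking these up front, using only the Python standard library. The tool list is an assumption: `cl` is the MSVC compiler executable, which is normally on `PATH` only inside a Visual Studio developer prompt, so a "missing" result for it is a hint rather than proof.

```python
import importlib.util
import shutil


def check_environment(modules=("sympy", "kenlm"), tools=("cmake", "cl", "bash")):
    """Report which Python modules and command-line tools cannot be found.

    The default names come from the log above; adjust them for your setup.
    """
    # find_spec returns None when a top-level module is not installed
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when the executable is not on PATH
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return {"missing_modules": missing_modules, "missing_tools": missing_tools}
```

Calling `check_environment()` inside the add-on's own Python environment and printing the result would show which of the pieces are absent before another install attempt.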
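HTML tags are all stripped when an entry is converted to a vector. The entry is converted to plain text, and ~ is replaced with the current headword.
Ultimately, though, it is slow because it runs on the CPU. Once I switch it to the GPU it should be several times faster (provided the graphics card or integrated GPU isn't too slow...).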
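To make the preprocessing described above concrete, here is a rough sketch of stripping tags and substituting the headword for `~`. It is not the tool's actual code; the function name and the whitespace normalization are assumptions for illustration.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities like &amp;
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def entry_to_plain_text(html, headword):
    """Strip tags from a dictionary entry and replace '~' with the headword."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts).replace("~", headword)
    # Collapse runs of whitespace left behind by removed markup (an assumption)
    return re.sub(r"\s+", " ", text).strip()


print(entry_to_plain_text("<b>run</b>: to ~ fast &amp; far", "run"))
# prints: run: to run fast & far
```

Since the tags are removed before vectorization anyway, pre-stripping them by hand changes what text is stored but presumably not what the model sees.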
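The text-to-vector transform model won't download. What can I do? I'd really like to try it; any pointers from the OP would be appreciated!
I'm stuck at step two: transformers never downloads. I've been at it for a whole day and can't tell where the problem is.
Do different folders have different configurations? Does every folder have to download its configuration all over again?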
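Also, note: when importing txt documents, UTF-8 is supported but GB2312 is not; a file in an unsupported encoding simply fails to import. Could this be improved to support more encodings?
When editing, quotation marks are shown in blue, which is nice. Can the colors of other strings be customized?
Thinking of my own AMD 580 card, which acts up in all sorts of ways... As for GPUs, anyone who has worked with them has had a taste of the various "delights" served up by Lao Huang (NVIDIA)...
Also, as a suggestion: if everyone wants to go the many-hands-make-light-work route, the forum could open a dedicated public upload area. Whatever dictionary someone imports, and whatever index files are generated for it, would be uploaded somewhere publicly downloadable; anyone who wants them could import them directly through the program's interface. That would save different people from grinding through the same dictionary on different machines every time they use it.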
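Friend, don't select "prefer direct download". Direct download means downloading without going through the mirror site, which normally doesn't work from within mainland China, so I added a mirror layer. Since the mirror might stop working someday, I also added a switch. If you want to download directly, you need dedicated networking software.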
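The mirror/direct switch can be pictured as choosing the order in which download sources are tried. The sketch below is an illustration only: the base URLs are placeholders, the function names are hypothetical, and the automatic fallback to the second source is my own addition rather than something the tool is known to do.

```python
import urllib.error
import urllib.request


def ordered_bases(direct_base, mirror_base, prefer_direct=False):
    """Return the sources in the order they should be tried."""
    if prefer_direct:
        return [direct_base, mirror_base]
    return [mirror_base, direct_base]


def download_with_fallback(path, bases, timeout=30, opener=urllib.request.urlopen):
    """Try each base URL in turn and return the first successful body."""
    last_error = None
    for base in bases:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            last_error = exc
    raise RuntimeError(f"all sources failed for {path!r}") from last_error
```

With `prefer_direct=False` the mirror is tried first, which matches the advice above for users without direct access.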
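Different folders can have different settings, or you can set global parameters that are shared. The quotation-mark color can be changed. GB2312 works too; I think you can convert the encoding in the bottom-right corner. I've spent the last couple of days fixing download and dependency-installation problems, so a lot still needs documenting.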
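For anyone who would rather convert files before importing, a small sketch of decoding with fallback encodings follows. The encoding list is an assumption: `gb18030` is used because it is a superset of GB2312 and GBK, and `utf-8-sig` also accepts UTF-8 files without a byte-order mark.

```python
def decode_text(data, encodings=("utf-8-sig", "gb18030")):
    """Decode bytes with the first encoding that succeeds.

    Returns (text, encoding_used). The candidate list is an assumption;
    extend it for other legacy encodings as needed.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")
```

Reading a file with `open(path, "rb").read()`, passing the bytes through `decode_text`, and writing the text back out as UTF-8 gives a file that the importer should accept.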
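It's possible, but also not quite. How to put it: converting text to vectors is a translation process, and the vector model is the translator. What my current default model produces differs from what another model would produce. If everyone used the same model, sharing would work, but if a better model comes along later, everything would have to be re-imported.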
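One way to make shared or copied vector data safe is to record which model produced it and check that before reuse. This is a sketch of that idea, not a description of how the tool stores its indexes; the manifest file name and its fields are hypothetical.

```python
import json
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical file name


def write_manifest(index_dir, model_name, dimension):
    """Record which embedding model produced the vectors in index_dir."""
    manifest = {"model": model_name, "dimension": dimension}
    path = Path(index_dir) / MANIFEST_NAME
    path.write_text(json.dumps(manifest, ensure_ascii=False), encoding="utf-8")
    return path


def can_reuse_index(index_dir, current_model):
    """True only if the index was built with the model currently in use."""
    path = Path(index_dir) / MANIFEST_NAME
    if not path.is_file():
        return False  # unknown origin: safer to rebuild
    manifest = json.loads(path.read_text(encoding="utf-8"))
    return manifest.get("model") == current_model
```

Under this scheme an index copied from another machine, or downloaded from a shared area, is reused only when the model names match; otherwise the dictionary is re-imported.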
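Everyone, thank you for your interest and enthusiasm. Once I've fixed the basic issues so that everybody can use the features normally, I'll start a separate thread to answer questions in one place. For now I'm focusing on fixing the configuration. (The environment does have problems: some users already had the dependencies installed on their machines, so they didn't get stuck, while those who didn't got stuck and failed. I'm working on it.)
It's a simple archiving problem: offer the download that matches whichever model is in use. If a public upload area can be opened, I suggest that when the tool finishes building an index it automatically asks whether to upload it to the public area, and when indexing starts it automatically notes that a ready-made archive exists in the public area and asks whether to regenerate it or just download it. It's only a few more options at the settings level. Frankly, these needs depend on the ecosystem; with a thriving ecosystem none of this is a big deal.
Most of the problems have been solved. Forum members whose installation or download failed can look here; there is an installation video walkthrough: [生花笔] installation tutorial (OCR, chat, smart search, text correction)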