Retraining a custom language model with the current spaCy version (compatibility issue)
I have installed spaCy with two language models: vi_spacy (not mine), a custom Vietnamese language model for spaCy from GitHub, and the Japanese ja_core_news_trf model.
I first installed the ja_core_news_trf model with the command python -m spacy download ja_core_news_trf in the Anaconda command line, and it worked without a problem. I then installed vi_spacy from the command line, and it worked when I tried it. But after that, the Japanese model no longer worked.
Each time I get this error:
OSError: [E050] Can't find model 'ja_core_news_trf'. It doesn't seem to be a Python package or a valid path to a data directory.
This happens even though pip list shows that ja_core_news_trf is installed.
After investigating, I found out that vi_spacy only works with spaCy v3.0.8, but ja_core_news_trf needs spaCy >=3.2.0,<3.3.0 and is incompatible with the current version. After running python -m spacy info, I get this warning:
UserWarning: [W095] Model 'ja_core_news_trf' (3.2.0) requires spaCy >=3.2.0,<3.3.0 and is incompatible with the current version (3.0.8). This may lead to unexpected results or runtime errors. To resolve this, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
After running python -m spacy validate, I get this:
ja_core_news_trf >=3.2.0,<3.3.0 3.2.0 --> n/a
vi_core_news_lg >=3.0.5,<3.1.0 0.0.1 ✔
The following packages are custom spaCy pipelines or not available for spaCy
v3.0.8:
ja_core_news_trf
So my question is: how can I retrain the custom Vietnamese model with the current spaCy version? I tried to contact the developer, but he doesn't reply, so I want to do it myself if that's possible.
Comments (1)
In nearly all cases spaCy v3 models are forwards-compatible with newer versions of spaCy v3, so download ja_core_news_trf and then install the Vietnamese model with pip install --no-deps so that pip doesn't install an older version of spacy as a dependency. You'll get a warning on load that an older model might be incompatible, but test it on your data; as long as the performance is the same as with the older version of spaCy, it should be fine to use.
See: https://spacy.io/usage/v3-2#upgrading
You can only retrain the model if you have access to the original training data.
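To make the version conflict concrete, here is a minimal sketch in plain Python that checks a spaCy version against the ranges reported by spacy validate in the question. The helper names (parse_version, satisfies, report) are hypothetical and written only for this illustration; this is not how spaCy performs its own check, and it only handles simple numeric versions such as 3.0.8.

```python
import re


def parse_version(text):
    """Turn '3.0.8' into (3, 0, 8); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Check a version against a comma-separated range such as '>=3.2.0,<3.3.0'."""
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    current = parse_version(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not ops[op](current, parse_version(bound)):
            return False
    return True


# Ranges copied from the `spacy validate` output in the question.
REQUIRED = {
    "ja_core_news_trf": ">=3.2.0,<3.3.0",
    "vi_core_news_lg": ">=3.0.5,<3.1.0",
}


def report(spacy_version):
    """Map each model name to whether the given spaCy version is in its range."""
    return {name: satisfies(spacy_version, spec) for name, spec in REQUIRED.items()}


if __name__ == "__main__":
    # Compare the asker's version (3.0.8) with the one the Japanese model needs.
    for version in ("3.0.8", "3.2.0"):
        print(version, report(version))
```

No single spaCy version falls inside both declared ranges. With spaCy 3.2.0 the Japanese model is in range and the Vietnamese model is not, which is why the approach above produces a warning on load and why testing on your own data is the deciding step.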