Retraining a custom language model with the current spaCy version (compatibility issue)

Posted on 2025-01-17 07:13:55

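I have installed spaCy with two language models: vi_spacy (not written by me), a custom Vietnamese language model for spaCy hosted on GitHub, and the Japanese ja_core_news_trf model. I first installed ja_core_news_trf by running python -m spacy download ja_core_news_trf in the Anaconda command line, and it worked without a problem. I then installed vi_spacy from the command line and tried it out, and it worked too. But when I tried the Japanese model again, it no longer worked.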

Each time, I get this error:

OSError: [E050] Can't find model 'ja_core_news_trf'. It doesn't seem to be a Python package or a valid path to a data directory.

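This happens even though pip list shows that ja_core_news_trf is installed. After investigating, I found that vi_spacy only works with spaCy v3.0.8, whereas ja_core_news_trf requires spaCy >=3.2.0,<3.3.0 and is therefore incompatible with the installed version. Running python -m spacy info prints this warning: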

UserWarning: [W095] Model 'ja_core_news_trf' (3.2.0) requires spaCy >=3.2.0,<3.3.0 and is incompatible with the current version (3.0.8). This may lead to unexpected results or runtime errors. To resolve this, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
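To make the conflict concrete, here is a minimal sketch of the kind of version check that leads to W095. It is not spaCy's own implementation; the helper names parse_version and satisfies are hypothetical, and only simple numeric versions such as 3.0.8 are handled.

```python
import re


def parse_version(text):
    """Turn a plain version string such as '3.2.0' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


# Comparison operators that can appear in a constraint like '>=3.2.0,<3.3.0'.
OPERATORS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


def satisfies(installed, constraint):
    """Return True if the installed version meets every clause of the constraint."""
    installed_version = parse_version(installed)
    for clause in constraint.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        operator, bound = match.groups()
        if not OPERATORS[operator](installed_version, parse_version(bound)):
            return False
    return True


# Installed spaCy is 3.0.8; the constraints are the ones reported above.
print(satisfies("3.0.8", ">=3.2.0,<3.3.0"))  # ja_core_news_trf -> False
print(satisfies("3.0.8", ">=3.0.5,<3.1.0"))  # vi_core_news_lg  -> True
```

The two ranges do not overlap, so no single spaCy version can satisfy both models at once.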

Running python -m spacy validate prints this:

ja_core_news_trf   >=3.2.0,<3.3.0   3.2.0   --> n/a
vi_core_news_lg    >=3.0.5,<3.1.0   0.0.1   ✔

 The following packages are custom spaCy pipelines or not available for spaCy
v3.0.8:
ja_core_news_trf
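The same information can be read from Python. The sketch below uses only the standard library's importlib.metadata to look up which spaCy requirement an installed package declares. It assumes the model packages list spaCy in their install requirements and are registered under the names shown by validate; the helper names are hypothetical.

```python
import re
from importlib import metadata


def is_spacy_requirement(requirement):
    """True for 'spacy<3.3.0,>=3.2.0' but not for 'spacy-transformers>=1.1'."""
    return re.match(r"spacy(?![A-Za-z0-9_.-])", requirement, re.IGNORECASE) is not None


def spacy_requirement(package_name):
    """Return the spaCy requirement strings of an installed package.

    Returns None if the package is not installed in this environment.
    """
    try:
        requirements = metadata.requires(package_name) or []
    except metadata.PackageNotFoundError:
        return None
    return [req for req in requirements if is_spacy_requirement(req)]


# Package names taken from the validate output above.
for name in ("ja_core_news_trf", "vi_core_news_lg"):
    print(name, "->", spacy_requirement(name))
```

If a package is missing from the active environment, the function returns None rather than raising, which helps tell "not installed" apart from "installed but incompatible".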

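So my question is: how can I retrain the custom Vietnamese model with the current spaCy version? I did try to contact the developer, but he has not replied, so I would like to do it myself if that is possible.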
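For reference, retraining in spaCy v3 goes through the spacy train command, which takes a training config plus the training and development corpora. The sketch below only assembles that command. It assumes the Vietnamese model's training config and its corpora in .spacy format are available, which is not confirmed here; all file paths are hypothetical placeholders.

```python
import subprocess
import sys


def build_train_command(config_path, output_dir, train_path, dev_path):
    """Assemble a 'python -m spacy train' call for the active environment."""
    return [
        sys.executable, "-m", "spacy", "train", config_path,
        "--output", output_dir,
        "--paths.train", train_path,
        "--paths.dev", dev_path,
    ]


def run_training(command):
    """Run the training command; raises CalledProcessError if it fails."""
    subprocess.run(command, check=True)


# Hypothetical file names; replace them with the model's actual config and data.
command = build_train_command(
    "config.cfg", "./vi_output", "./train.spacy", "./dev.spacy"
)
print(" ".join(command))
# run_training(command)  # uncomment once the config and corpora exist
```

Running this inside an environment with spaCy 3.2.x would produce a pipeline tied to that version, so it could be loaded alongside ja_core_news_trf.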

