Load already-downloaded spaCy language models into Docker containers without downloading them again

Posted 2025-02-03 07:11:00

I'd like to run multiple spacy language models on various docker containers. I don't want the docker image to contain the line RUN python -m spacy download en_core_web_lg, as other processes might have different language models.

My question is: Is it possible to download multiple spaCy language models locally (i.e. en_core_web_lg, en_core_web_md, ...), and then load these models into the Python spaCy environment when the docker container spawns?

This process might have the following steps:

  1. Spawn docker container and bind a volume "language_models/" to the container which contains a number of spacy models.
  2. Run some spacy command such as python -m spacy download --local ./language_models/en_core_web_lg which points at the language model which you want the environment to have.

The hope is that, since the language model already exists on the shared volume, the download/import time is significantly reduced for each new container. Each container also would not have unnecessary language models on it, and the Docker image would not be specific to any language models at all.
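
As a rough sketch of what this preparation could look like on the host (this is an assumption about the setup, not part of the original question; it presumes the pipeline packages have already been installed on the host once):

import spacy

# One-time, host-side step: serialize each installed pipeline package into
# the directory that will later be bind-mounted into the containers.
for model in ("en_core_web_lg", "en_core_web_md"):
    nlp = spacy.load(model)
    nlp.to_disk(f"language_models/{model}")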

2 Comments

九厘米的零° 2025-02-10 07:11:01

Thanks for the comment @polm23! I had an additional layer of complexity since the SpaCy model was ultimately used to train a Rasa model. The solution I've opted for is to save models locally using:

import spacy
model = "en_core_web_lg"  # name of the installed pipeline package to export
nlp = spacy.load(model)
nlp.to_disk(f'language_models/{model}')

Then make the specific model directory visible to the Docker container via a mounted volume. In Rasa, at least, you can point at the language model using a local path:

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: "../../language_models/MODEL_NAME"
recipe: default.v1
独留℉清风醉 2025-02-10 07:11:00

There are two ways to do this.

The easier one is to mount a volume in Docker with the model directory and specify it as a path. spaCy lets you call spacy.load("some/path"), so no pip install is required.
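
A minimal sketch of that approach (the mount path and image name below are assumptions, not something from the answer):

import spacy

# Assumes the container was started with a bind mount along the lines of:
#   docker run -v /host/language_models:/language_models my-image
# and that each subdirectory was produced with nlp.to_disk().
nlp = spacy.load("/language_models/en_core_web_lg")
doc = nlp("Pipelines saved with to_disk can be loaded straight from a path.")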

If you really need to use pip to install something, you can also download the zipped models and pip install the archive file. However, by default that involves making a copy of the package, which reduces the benefit. If you unzip the model download and mount that, you can use pip install -e (editable mode), which is usually used for development. I wouldn't recommend this, but if you are using import en_core_web_sm or similar and have difficulty refactoring, it might be what you want.
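
For illustration only (paths and version are hypothetical): pip install /language_models/en_core_web_lg-3.8.0.tar.gz installs from the mounted archive but copies the files into site-packages, whereas pip install -e /language_models/en_core_web_lg-3.8.0/ on the unzipped package directory keeps the files on the mounted volume.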
