Error code 126/127 when using mallet on Google Colab

Posted 2025-01-10 18:08:46

from gensim.models.wrappers import LdaMallet  # only available in Gensim versions before 4.0
# mallet_path = 'C:/Users/kmuth/Downloads/mallet-2.0.8/bin/mallet' # update this path
mallet_path = '/content/drive/MyDrive/data/mallet/mallet-2.0.8/bin/mallet'
# Call the imported class directly; the bare name `gensim` is not imported above
ldamallet = LdaMallet(mallet_path, corpus=doc_term_matrix, num_topics=15, id2word=dictionary)

I'm currently getting this error:
CalledProcessError: Command '/content/drive/MyDrive/data/mallet/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/d20b66_corpus.txt --output /tmp/d20b66_corpus.mallet' returned non-zero exit status 126.

This happens when I try to create an LDA model using mallet; I have read that it somehow does a better job than the built-in LDA model in the gensim package.

I'm trying to follow this tutorial (to try the mallet part):
https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/#14computemodelperplexityandcoherencescore

I would really appreciate any help, as I have no idea what the error means. Is it not finding the file? Do I have to install it? I'm pretty much a noob.

I have tried changing the mallet path to locations on my PC and on my Drive, to no avail.

Thank you, Sean

Comments (1)

盛装女皇 2025-01-17 18:08:46

IIUC, Google Colab runs on Google's servers.

Did you compile/install the native mallet executable (on which gensim.models.wrappers.LdaMallet depends) to a path, and in a format, that is accessible to the notebook and from which the Google Colab notebook can execute it? (Is that even allowed in Google Colab?)
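
As a rough way to check this from inside the notebook: by shell convention, exit status 127 usually means the command was not found, and 126 usually means the file was found but could not be executed (for example, the execute permission is missing). The sketch below uses only the Python standard library; the helper names `diagnose_executable` and `make_executable` are made up for illustration, and the path is the one from the question. Whether a mounted Drive folder keeps the execute bit after `chmod` is an assumption I have not verified, so copying mallet to the notebook's local disk may still be needed.

```python
import os
import stat

def diagnose_executable(path):
    """Hypothetical helper: report why `path` might fail to run."""
    if not os.path.isfile(path):
        return "missing"          # a shell would typically exit with 127
    if not os.access(path, os.X_OK):
        return "not executable"   # a shell would typically exit with 126
    return "ok"

def make_executable(path):
    """Hypothetical helper: add execute permission for user, group and others."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

# Path taken from the question; adjust it to wherever mallet actually lives.
mallet_path = '/content/drive/MyDrive/data/mallet/mallet-2.0.8/bin/mallet'
print(diagnose_executable(mallet_path))
```

If this prints "not executable", calling `make_executable(mallet_path)` and re-running the check would be the next thing to try. Mallet itself is a Java program, so a Java runtime also has to be available to the notebook.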

Note also that the latest (4.0+) versions of Gensim have eliminated the wrappers as somewhat awkward to use, hard to maintain, and somewhat redundant with other implementations. So you may want to consider using Gensim's own LdaModel instead of this no-longer-supported wrapper of another package.
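
To see which side of that 4.0 cutoff the notebook is on, something like the following might help. It is only a sketch: `has_mallet_wrapper` is a made-up helper that looks at the major version number and nothing else, on the assumption that the wrappers are present before 4.0 and gone from 4.0 onward.

```python
from importlib import metadata

def has_mallet_wrapper(version):
    """Hypothetical helper: True if the version string predates Gensim 4.0."""
    major = int(version.split(".")[0])
    return major < 4

try:
    installed = metadata.version("gensim")
    print(installed, has_mallet_wrapper(installed))
except metadata.PackageNotFoundError:
    print("gensim is not installed in this environment")
```

If the helper returns False, `from gensim.models.wrappers import LdaMallet` would be expected to fail on import, before mallet is ever invoked.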
