Error code 126/127 when using mallet on Google Colab
from gensim.models.wrappers import LdaMallet
# mallet_path = 'C:/Users/kmuth/Downloads/mallet-2.0.8/bin/mallet' # update this path
mallet_path = '/content/drive/MyDrive/data/mallet/mallet-2.0.8/bin/mallet'
ldamallet = LdaMallet(mallet_path, corpus=doc_term_matrix, num_topics=15, id2word=dictionary)
I'm currently getting this error:
CalledProcessError: Command '/content/drive/MyDrive/data/mallet/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/d20b66_corpus.txt --output /tmp/d20b66_corpus.mallet' returned non-zero exit status 126.
when trying to create an LDA model using Mallet, as I have read that it somehow does a better job than the built-in LDA model in the gensim package.
I'm trying to follow this tutorial (to try the Mallet bit):
https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/#14computemodelperplexityandcoherencescore
I would really appreciate any help, as I have no idea what the error is. Is it not finding the file? Do I have to install it? I'm pretty much a noob.
I have tried changing the mallet path around, both from my PC and from my Drive, to no avail.
Thank you, Sean
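The two exit statuses in the title follow the usual shell convention: 126 means the command was found but could not be executed (often a missing execute permission), and 127 means the command was not found. A minimal sketch for checking which case applies to the path above, assuming a POSIX filesystem such as the one on Colab; `diagnose_executable` is a hypothetical helper, not part of Gensim or Mallet:

```python
import os


def diagnose_executable(path):
    """Return a short label describing why a shell might fail to run `path`.

    Hypothetical helper for illustration only.
    """
    if not os.path.isfile(path):
        # A shell would typically report exit status 127 here.
        return "missing"
    if not os.access(path, os.X_OK):
        # A shell would typically report exit status 126 here.
        return "not executable"
    return "ok"


# Path taken from the question; adjust to wherever Mallet actually lives.
mallet_path = '/content/drive/MyDrive/data/mallet/mallet-2.0.8/bin/mallet'
print(diagnose_executable(mallet_path))
```

If this reports "not executable", one thing to try is copying the Mallet folder from the Drive mount to local Colab storage and setting the execute bit there. Whether that is sufficient is an assumption, since Mallet's launcher script also needs a Java runtime.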
Comments (1)
IIUC, Google Colab runs on Google's servers.
Did you compile/install the native mallet executable (on which that gensim.models.wrappers.LdaMallet depends) to a path and format, accessible to the notebook, from which the Google Colab notebook can execute it? (Is that even allowed in Google Colab?)
Note also that the latest (4.0+) versions of Gensim have eliminated the wrappers as somewhat awkward to use, hard to maintain, and somewhat redundant with other implementations. So you may want to consider using Gensim's own LdaModel instead of this no-longer-supported wrapper of another package.
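As a rough illustration of the suggestion above, here is a minimal sketch that checks whether the installed Gensim still predates 4.0 and then trains Gensim's built-in LdaModel on a toy corpus. The helper names `has_mallet_wrapper` and `build_lda`, the toy documents, and the `passes` and `random_state` values are assumptions made for the example, not anything from the question:

```python
from importlib import metadata


def has_mallet_wrapper(gensim_version):
    """True if this Gensim version string predates 4.0, which removed the wrappers."""
    major = int(gensim_version.split(".")[0])
    return major < 4


def build_lda(texts, num_topics=2):
    """Train Gensim's built-in LdaModel on a list of tokenized documents."""
    # Imported lazily so the version check above works without gensim installed.
    from gensim.corpora import Dictionary
    from gensim.models import LdaModel

    dictionary = Dictionary(texts)
    corpus = [dictionary.doc2bow(tokens) for tokens in texts]
    return LdaModel(corpus=corpus, id2word=dictionary,
                    num_topics=num_topics, random_state=0, passes=5)


if __name__ == "__main__":
    try:
        installed = metadata.version("gensim")
    except metadata.PackageNotFoundError:
        print("gensim is not installed")
    else:
        print("gensim", installed, "has wrappers:", has_mallet_wrapper(installed))
        # Toy documents, invented purely for illustration.
        toy_texts = [["cat", "dog", "pet"], ["dog", "bone", "pet"],
                     ["stock", "market", "trade"], ["market", "price", "trade"]]
        for topic in build_lda(toy_texts).print_topics():
            print(topic)
```

With the question's own variables, the equivalent call would presumably be LdaModel(corpus=doc_term_matrix, id2word=dictionary, num_topics=15), which needs no external executable.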