How to load NLTK stopwords or words from local disk
I need to load the NLTK 'words' data from local disk. In my notebook, the code looks like this:
import nltk
nltk.data.path.append("/data") # Setting path here
nltk.corpus.words.words()
But I get the following error:
LookupError Traceback (most recent call last)
/anaconda3/lib/python3.8/site-packages/nltk/corpus/util.py in __load(self)
83 try:
---> 84 root = nltk.data.find(f"{self.subdir}/{zip_name}")
85 except LookupError:
/anaconda3/lib/python3.8/site-packages/nltk/data.py in find(resource_name, paths)
582 resource_not_found = f"\n{sep}\n{msg}\n{sep}\n"
--> 583 raise LookupError(resource_not_found)
584
LookupError:
**********************************************************************
Resource words not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('words')
For more information see: https://www.nltk.org/data.html
Attempted to load corpora/words.zip/words/
Searched in:
- '/home/my_user_name/nltk_data'
- '/anaconda3/nltk_data'
- '/anaconda3/share/nltk_data'
- '/anaconda3/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- '/data'
I followed the manual installation section at https://www.nltk.org/data.html
However, instead of using the NLTK_DATA environment variable, I want to set the path from within the notebook.
Any help? Thanks in advance.
Comments (1)
It is frustrating that a package as popular as NLTK gives such a misleading message.
The following line is misleading:
Attempted to load corpora/words.zip/words/
NLTK is not only looking for a words folder inside words.zip. It first looks for the unzipped folder corpora/words under each search path, so the location it was attempting to load is really:
nltk_data/corpora/words/
So the solution is to add the correct path manually, as follows:
nltk.data.path.append("./data/nltk_data")
Then put the unzipped data inside that directory, for example as ./data/nltk_data/corpora/words, and access the words with nltk.corpus.words.words() as in the question.
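To check the directory layout before calling nltk.corpus.words.words(), a minimal standard-library sketch like the one below may help. It only approximates the lookup described above and does not call NLTK itself; find_corpus and demo are hypothetical helper names, and the "en" file is just a stand-in word list.

```python
import tempfile
from pathlib import Path


def find_corpus(search_paths, name="words", subdir="corpora"):
    """Return the first <path>/<subdir>/<name> folder or <name>.zip file found, else None.

    Hypothetical helper: a simplified approximation of the lookup, not NLTK code.
    """
    for base in search_paths:
        folder = Path(base) / subdir / name
        if folder.is_dir():
            return folder
        archive = Path(base) / subdir / f"{name}.zip"
        if archive.is_file():
            return archive
    return None


def demo():
    """Build a throwaway nltk_data tree and probe it from two different roots."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "nltk_data"
        corpus_dir = root / "corpora" / "words"
        corpus_dir.mkdir(parents=True)
        # Stand-in word list; the real corpus files come from the NLTK download.
        (corpus_dir / "en").write_text("apple\nbanana\n")

        # Pointing at the parent of nltk_data finds nothing.
        missing = find_corpus([tmp])
        # Pointing at nltk_data itself finds corpora/words.
        found = find_corpus([str(root)])
        return missing, found.relative_to(root).as_posix()


if __name__ == "__main__":
    print(demo())
```

In the question, "/data" was appended to the search path, so under this layout the data would need to sit at /data/corpora/words (or /data/corpora/words.zip) for the lookup to succeed.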