Replacing synonyms in a corpus using WordNet and NLTK - python
I am trying to write a simple Python script that uses NLTK to find and replace synonyms in a txt file.
The following code gives me an error:
Traceback (most recent call last):
  File "C:\Users\Nedim\Documents\sinon2.py", line 21, in <module>
    change(word)
  File "C:\Users\Nedim\Documents\sinon2.py", line 4, in change
    synonym = wn.synset(word + ".n.01").lemma_names
TypeError: can only concatenate list (not "str") to list
Here is the code:
from nltk.corpus import wordnet as wn

def change(word):
    synonym = wn.synset(word + ".n.01").lemma_names
    if word in synonym:
        filename = open("C:/Users/tester/Desktop/test.txt").read()
        writeSynonym = filename.replace(str(word), str(synonym[0]))
        f = open("C:/Users/tester/Desktop/test.txt", 'w')
        f.write(writeSynonym)
        f.close()

f = open("C:/Users/tester/Desktop/test.txt")
lines = f.readlines()
for i in range(len(lines)):
    word = lines[i].split()
    change(word)
Comments (2)
This isn't terribly efficient, and it would not replace a word with a single synonym, because each word could have multiple synonyms that you could choose from.
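A minimal sketch of gathering the candidate synonyms for one word, so that a choice can be made among them. The helper name `synonyms_for` and its `synset_lookup` parameter are hypothetical, introduced only for illustration. The sketch assumes NLTK 3, where `lemma_names()` is a method, and it looks across all synsets rather than only `".n.01"`.

```python
def synonyms_for(word, synset_lookup=None):
    """Return the distinct lemma names found for `word`, excluding the word itself.

    `synset_lookup` is a hypothetical hook: any callable that maps a word to
    objects with a `lemma_names()` method. It defaults to WordNet's `wn.synsets`.
    """
    if synset_lookup is None:
        # Imported lazily so the helper can be tried without the WordNet data.
        from nltk.corpus import wordnet as wn
        synset_lookup = wn.synsets

    found = []
    for synset in synset_lookup(word):
        for name in synset.lemma_names():
            # Skip the word itself and anything already collected.
            if name.lower() != word.lower() and name not in found:
                found.append(name)
    return found
```

With the WordNet corpus downloaded, `synonyms_for("dog")` returns a list of candidates. Picking the first entry or using `random.choice` are two simple ways to settle on one.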
Two things. First, you can change the file reading portion to:
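The snippet that followed is not preserved on this page. One common form, shown here as a sketch and not necessarily what the answer contained, opens the file in a `with` block and iterates over it line by line. The function name `iter_lines` and the UTF-8 encoding are assumptions.

```python
def iter_lines(path):
    """Yield each line of the file at `path` without its trailing newline."""
    # The with block closes the file even if an error occurs mid-loop.
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")
```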
And second, .split() returns a list of strings, whereas your change function appears to operate on only a single word at a time. This is what's causing the exception: your word is actually a list. If you want to process every word on that line, make it look like:
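The answer's own snippet is missing here, so the following is only a sketch of the idea: loop over the list that `.split()` returns and pass one string at a time to the per-word function. The names `replace_words` and `first_noun_lemma` are hypothetical. `first_noun_lemma` assumes NLTK 3 with the WordNet corpus downloaded.

```python
def replace_words(line, replace_word):
    """Apply `replace_word` to every whitespace-separated word in `line`."""
    replaced = []
    for word in line.split():  # split() gives a list; handle one str at a time
        replaced.append(replace_word(word))
    # Note: joining with a single space collapses the original spacing.
    return " ".join(replaced)


def first_noun_lemma(word):
    """Return the first lemma of the word's first noun synset, or the word itself."""
    from nltk.corpus import wordnet as wn  # lazy import; needs the WordNet data
    synsets = wn.synsets(word, pos=wn.NOUN)
    if not synsets:
        return word  # no noun entry in WordNet, so leave the word unchanged
    return synsets[0].lemma_names()[0]
```

Calling `replace_words(line, first_noun_lemma)` for each line avoids the `TypeError`, since `word + ".n.01"` is never evaluated on a list. The first lemma may be the word itself, so in practice a different selection rule may be wanted.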