Extracting synonyms from wordnet using synonyms()
Suppose I pull the synonyms of "help" using the synonyms() function from the wordnet package and get the following:
Str = synonyms("help")
Str
[1] "c(\"aid\", \"assist\", \"assistance\", \"help\")"
[2] "c(\"aid\", \"assistance\", \"help\")"
[3] "c(\"assistant\", \"helper\", \"help\", \"supporter\")"
[4] "c(\"avail\", \"help\", \"service\")"
Then I can get a single character vector using
unique(unlist(lapply(parse(text=Str),eval)))
which gives this result:
[1] "aid" "assist" "assistance" "help" "assistant" "helper" "supporter"
[8] "avail" "service"
The above process was suggested by Gabor Grothendieck. The solution works well for "help", but I can't figure out why I get an error message when I change the query term to "company", "boy", or some other word.
One possible reason is that the sixth synonym of "company" (see below) is a single term and does not follow the format "c(\"company\")".
synonyms("company")
[1] "c(\"caller\", \"company\")"
[2] "c(\"company\", \"companionship\", \"fellowship\", \"society\")"
[3] "c(\"company\", \"troupe\")"
[4] "c(\"party\", \"company\")"
[5] "c(\"ship's company\", \"company\")"
[6] "company"
Could someone help me solve this problem?
Many thanks.
Comments (2)
You can solve this by creating a small helper function that uses R's try mechanism to catch errors. If eval produces an error, return the original string; otherwise return the result of eval.
Create a helper function:
Produces:
I reproduced your data and then used dput to reproduce it here:
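A minimal sketch of the kind of helper described in this answer, using only base R. The name safe_eval is hypothetical, and Str is typed in by hand from the output shown in the question rather than taken from a live wordnet call. The extra is.character() check is an added assumption: a bare word that happens to name an existing R object (for example help) would evaluate without error, so the sketch falls back to the original string in that case too.

```r
# Synonym strings for "company", copied from the question.
# The sixth element is a bare word, not a c(...) expression.
Str <- c(
  "c(\"caller\", \"company\")",
  "c(\"company\", \"companionship\", \"fellowship\", \"society\")",
  "c(\"company\", \"troupe\")",
  "c(\"party\", \"company\")",
  "c(\"ship's company\", \"company\")",
  "company"
)

# Hypothetical helper: try to evaluate one string as R code.
# If evaluation fails, or yields something other than a character
# vector, return the original string unchanged.
safe_eval <- function(s) {
  out <- try(eval(parse(text = s)), silent = TRUE)
  if (inherits(out, "try-error") || !is.character(out)) s else out
}

result <- unique(unlist(lapply(Str, safe_eval)))
print(result)
```

With this input, result holds the eight distinct terms, with "company" kept once even though it also appears as the bare sixth element. Note that evaluating strings with eval is only reasonable when the strings come from a trusted source.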
Those synonyms are in a form that looks like an expression, so you should be able to parse them as you illustrated. BUT: when I execute your original code above, I get an error from the synonyms call because you included no part-of-speech argument.
Observe that the code of synonyms uses getSynonyms, and that its code has a unique wrapped around it, so all of the pre-processing you are doing is no longer needed (if you update).
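As a rough illustration of passing the part-of-speech argument, the sketch below calls synonyms with "NOUN". It assumes the wordnet package, Java, and a WordNet dictionary are installed; the guard and the empty default for syn are additions so that the snippet does nothing harmful when they are not.

```r
# Default result; stays empty if wordnet or its dictionary is unavailable.
syn <- character(0)

# Only query WordNet when the package loads and a dictionary is found.
if (requireNamespace("wordnet", quietly = TRUE) && wordnet::initDict()) {
  # Second argument is the part of speech, e.g. "NOUN" or "VERB".
  syn <- wordnet::synonyms("company", "NOUN")
}

print(syn)
```

If the call succeeds, syn should already be a plain character vector, so the parse and eval step from the question would not be needed.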