Word sense disambiguation algorithm (Lesk algorithm)
Hi,
Can anybody help me find an algorithm, in Java code, that finds synonyms of a search word based on its context? I want to implement the algorithm with the WordNet database.
For example, take "I am running a Java program". From the context, I want to find synonyms for the word "running", but the synonyms must suit that context.
Comments (3)
Let me illustrate a possible approach. Let the sentence be:
A B C
Let each word have the following candidate senses (synsets):
{A:(a1, a2, a3), B:(b1), C:(c1, c2)}
Now form all possible combinations of senses:
(a1, b1, c1), (a1, b1, c2), (a2, b1, c1) ... (a3, b1, c2)
Then define a function:
F(a, b, c)
which returns the distance (score) between (a, b, c). For starters, the function F can just return the product of the inverses of the number of nodes between each pair of nodes, and you pick the combination that maximizes it:
Maximize(Product[i=0 to len(sentence); j=0 to len(sentence), j != i] (1/D(node_i, node_j)))
Later on, you can increase its complexity.
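The steps above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the taxonomy in `PARENT` and the sense labels such as `run#execute` are made up for the example and are not WordNet data, and `D` is taken to be the path length (in edges) between two nodes of that toy tree. Each pair of words is counted once (i < j), which does not change which combination wins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SenseCombinationSearch {
    // Hypothetical toy taxonomy (child -> parent). A real system would
    // walk WordNet hypernym links instead of this hand-made map.
    static final Map<String, String> PARENT = Map.ofEntries(
        Map.entry("computing", "entity"),
        Map.entry("motion", "entity"),
        Map.entry("broadcast", "entity"),
        Map.entry("food", "entity"),
        Map.entry("place", "entity"),
        Map.entry("run#execute", "computing"),
        Map.entry("run#jog", "motion"),
        Map.entry("java#language", "computing"),
        Map.entry("java#coffee", "food"),
        Map.entry("java#island", "place"),
        Map.entry("program#software", "computing"),
        Map.entry("program#tv_show", "broadcast"));

    // Nodes from the given node up to the root, starting with the node itself.
    static List<String> pathToRoot(String node) {
        List<String> path = new ArrayList<>();
        for (String n = node; n != null; n = PARENT.get(n)) {
            path.add(n);
        }
        return path;
    }

    // D(a, b): number of edges on the path through the lowest common ancestor.
    public static int distance(String a, String b) {
        List<String> pa = pathToRoot(a);
        List<String> pb = pathToRoot(b);
        for (int i = 0; i < pa.size(); i++) {
            int j = pb.indexOf(pa.get(i));
            if (j >= 0) {
                return i + j;
            }
        }
        throw new IllegalArgumentException("No common ancestor: " + a + ", " + b);
    }

    // F: product of 1/D over all pairs of chosen senses.
    // Assumption: a distance of 0 is clamped to 1 to avoid dividing by zero.
    public static double score(List<String> senses) {
        double product = 1.0;
        for (int i = 0; i < senses.size(); i++) {
            for (int j = i + 1; j < senses.size(); j++) {
                product *= 1.0 / Math.max(1, distance(senses.get(i), senses.get(j)));
            }
        }
        return product;
    }

    // Tries every combination of one sense per word and keeps the best-scoring one.
    public static List<String> best(List<List<String>> candidates) {
        List<String> best = new ArrayList<>();
        search(candidates, 0, new ArrayList<>(), best, new double[] {-1.0});
        return best;
    }

    private static void search(List<List<String>> candidates, int index,
                               List<String> current, List<String> best, double[] bestScore) {
        if (index == candidates.size()) {
            double s = score(current);
            if (s > bestScore[0]) {
                bestScore[0] = s;
                best.clear();
                best.addAll(current);
            }
            return;
        }
        for (String sense : candidates.get(index)) {
            current.add(sense);
            search(candidates, index + 1, current, best, bestScore);
            current.remove(current.size() - 1);
        }
    }

    public static void main(String[] args) {
        // Candidate senses for "running", "Java", "program" (hypothetical labels).
        List<List<String>> candidates = List.of(
            List.of("run#jog", "run#execute"),
            List.of("java#coffee", "java#island", "java#language"),
            List.of("program#tv_show", "program#software"));
        System.out.println(best(candidates));
        // prints [run#execute, java#language, program#software]
    }
}
```

The number of combinations grows as the product of the sense counts per word, so exhaustive search like this is only practical for short sentences or a small context window.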
This document is a good fit for your problem. The accuracy of the algorithm is not high, but I think it will be enough.
At this link you can find the Java API for WordNet Searching (JAWS).
Hi, I came across this page when I was searching for Lesk algorithm implementations.
I think it comes as part of the JAWS package.
I haven't used it yet, but I guess it will help.
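
For anyone who wants to see the idea before pulling in a library, here is a minimal sketch of the simplified Lesk approach: pick the sense whose dictionary gloss shares the most words with the sentence. It does not use JAWS or WordNet. The glosses, the sense labels and the stop-word list below are made up for the example, and a real implementation would read the glosses from WordNet instead.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class SimplifiedLesk {
    // Assumption: a tiny hand-picked stop-word list, just enough for the example.
    static final Set<String> STOP_WORDS =
        Set.of("a", "an", "the", "of", "to", "in", "on", "or", "and", "is", "am", "i");

    // Lower-cases the text, splits on non-letters and drops stop words.
    public static Set<String> tokens(String text) {
        Set<String> out = new HashSet<>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    // Returns the key of the gloss that overlaps most with the sentence.
    // Ties go to the sense listed first, so pass a map with a stable order.
    public static String disambiguate(String sentence, Map<String, String> glosses) {
        Set<String> context = tokens(sentence);
        String bestSense = null;
        int bestOverlap = -1;
        for (Map.Entry<String, String> entry : glosses.entrySet()) {
            Set<String> overlap = tokens(entry.getValue());
            overlap.retainAll(context);
            if (overlap.size() > bestOverlap) {
                bestOverlap = overlap.size();
                bestSense = entry.getKey();
            }
        }
        return bestSense;
    }

    public static void main(String[] args) {
        // Hypothetical senses of "running": key = synonyms, value = gloss.
        Map<String, String> glosses = new LinkedHashMap<>();
        glosses.put("sprint, jog", "move fast on foot using the legs");
        glosses.put("execute, launch", "carry out a process or program on a computer");
        System.out.println(disambiguate("I am running a Java program", glosses));
        // prints execute, launch
    }
}
```

Here the second gloss wins because it shares the word "program" with the sentence, while the first gloss shares nothing. With real glosses the overlaps are often small or tied, which is one reason the accuracy of this kind of method is limited.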