Choosing an NLP API (Java)
I am (very) new to the field of NLP. I am looking for an API (in Java) that can tell me whether two pieces of text have the same meaning (or whether one is derived from the other).
For example:
"billy said tom was a nice kid"
is the same as
"tom is a nice kid according to billy"
I checked GATE and OpenNLP. It seems GATE only offers an API for annotations, and OpenNLP doesn't support this either.
Comments (2)
Omri, no existing piece of software, in Java or any other programming language, can tell you this. Text understanding is the holy grail of natural language processing.
I suggest you start with smaller tasks and gradually work up to this much larger one.
Please see this question and the answers.com page on NLP for some pointers.
Textual entailment, an active research area, may be close to what you are asking about.
You can try the Retina API from Cortical.io: it measures the semantic similarity of any two texts using several distance measures (cosine similarity, Jaccard distance, Euclidean distance...). You can even get a visual representation of the semantic overlap.
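To make two of those distance measures concrete, here is a minimal sketch in plain Java. It does not call the Retina API and does not reflect how Cortical.io represents text. It only computes cosine similarity and Jaccard distance over simple word counts. The class name `TextSimilarity` and its method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class: bag-of-words similarity only, no semantics.
public class TextSimilarity {

    // Lowercase the text, split on anything that is not a letter or digit,
    // and count how often each word occurs.
    public static Map<String, Integer> wordCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a word-count vector.
    private static double norm(Map<String, Integer> counts) {
        double sum = 0.0;
        for (int c : counts.values()) {
            sum += (double) c * c;
        }
        return Math.sqrt(sum);
    }

    // Cosine of the angle between the two word-count vectors (0.0 if either text has no words).
    public static double cosineSimilarity(String a, String b) {
        Map<String, Integer> ca = wordCounts(a);
        Map<String, Integer> cb = wordCounts(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : ca.entrySet()) {
            dot += e.getValue() * cb.getOrDefault(e.getKey(), 0);
        }
        double na = norm(ca);
        double nb = norm(cb);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    // Jaccard distance: 1 - |shared words| / |all distinct words|.
    public static double jaccardDistance(String a, String b) {
        Set<String> wordsA = wordCounts(a).keySet();
        Set<String> wordsB = wordCounts(b).keySet();
        Set<String> union = new HashSet<>(wordsA);
        union.addAll(wordsB);
        if (union.isEmpty()) {
            return 0.0; // two empty texts are treated as identical
        }
        Set<String> shared = new HashSet<>(wordsA);
        shared.retainAll(wordsB);
        return 1.0 - (double) shared.size() / union.size();
    }

    public static void main(String[] args) {
        String s1 = "billy said tom was a nice kid";
        String s2 = "tom is a nice kid according to billy";
        System.out.println("cosine similarity: " + cosineSimilarity(s1, s2));
        System.out.println("jaccard distance:  " + jaccardDistance(s1, s2));
    }
}
```

On the two sentences from the question, the pair shares 5 of 10 distinct words, so the Jaccard distance is 0.5 and the cosine similarity is about 0.67. Word overlap is not meaning, though: "tom said billy was a nice kid" contains exactly the same words as the first sentence and scores as near-identical, which is the limitation the first answer points to.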