Understanding OpenCalais relevance scores
I am trying to understand the relevance score that OpenCalais returns for each entity. What does it signify, and how should it be interpreted? I would be thankful for any insights.
Comments (1)
Their documentation states: The relevance capability detects the importance of each unique entity and assigns a relevance score in the range 0-1 (1 being the most relevant and important).
While they do not explain exactly what 'relevance' means, one would expect it to quantify how central the entity is to the discourse of the document. It is likely influenced by factors such as the entity's mention frequency in this document compared with its expected frequency in a random document (cf. TF-IDF), but it could also involve more sophisticated discourse analysis.
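To make the TF-IDF comparison concrete, here is a minimal Python sketch of a frequency-based score rescaled to the 0-1 range. It is not OpenCalais's algorithm, which is undocumented; the function name `tfidf_relevance`, the smoothing, and the rescaling by the top score are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_relevance(doc_entities, background_docs):
    """Toy TF-IDF-style score per entity, rescaled so the top entity gets 1.0.

    doc_entities: list of entity mentions in the current document.
    background_docs: list of sets, each holding the entities of another document.
    Hypothetical helper for illustration only, not the OpenCalais method.
    """
    if not doc_entities:
        return {}

    mention_counts = Counter(doc_entities)
    n_docs = len(background_docs) + 1  # background documents plus this one

    raw = {}
    for entity, count in mention_counts.items():
        # Term frequency: share of all mentions in this document.
        tf = count / len(doc_entities)
        # Document frequency, counting the current document itself.
        df = 1 + sum(1 for doc in background_docs if entity in doc)
        # Smoothed inverse document frequency (always >= 1).
        idf = math.log(n_docs / df) + 1.0
        raw[entity] = tf * idf

    # Rescale into 0-1 by dividing by the highest raw score.
    top = max(raw.values())
    return {entity: score / top for entity, score in raw.items()}


if __name__ == "__main__":
    mentions = ["Acme", "Acme", "Acme", "London", "Reuters"]
    background = [{"London", "Reuters"}, {"London"}, {"Reuters", "Paris"}]
    for entity, score in sorted(tfidf_relevance(mentions, background).items()):
        print(f"{entity}: {score:.2f}")
```

Under these assumptions, an entity that is mentioned often here but rarely elsewhere ("Acme") scores highest, while entities that appear in most documents score low. A production system would likely combine this with positional or discourse features, as the answer suggests.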