Visualizing points of similarity between documents
We are currently working on a project on plagiarism detection between two text documents. We have to compare two submitted documents and present the comparison results. For that, I want to show the two documents side by side in a GUI and highlight the passages where they are similar. I have used various algorithms, such as the vector space model and shingle cloud, to get a similarity score between the two documents. However, they only give a score; they don't tell me which sections are similar, and those sections are what I need to show the user in the graphical interface.
Thanks,
Nuwan
Comments (1)
Does it really need to be graphical? You're comparing text, so a textual interface seems like a natural fit. That said, you could put something together fairly quickly with Swing. I'd probably start by printing out the shingles that the two documents share, along with some surrounding context. I also searched for an off-the-shelf diff engine you could use but came up short. Maybe you could shell out to the Unix diff tool, or otherwise incorporate it into your application?
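As a rough illustration of the "print the shared shingles" idea, here is a minimal Java sketch. It assumes word-level shingles of k consecutive words, case-insensitive matching, and a simple \w+ tokenizer; none of that comes from the question, and the class and method names (SharedShingles, sharedSpans) are made up for this example. Instead of only a score, it returns the character ranges in one document that are covered by shingles also found in the other, which is the information a highlighter needs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class SharedShingles {
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Returns each word as {startOffset, endOffset} (end is exclusive).
    private static List<int[]> tokenize(String text) {
        List<int[]> tokens = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            tokens.add(new int[] {m.start(), m.end()});
        }
        return tokens;
    }

    // Builds the normalized (lowercased, single-spaced) shingle of k words starting at token 'from'.
    private static String key(String text, List<int[]> tokens, int from, int k) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < from + k; i++) {
            if (i > from) {
                sb.append(' ');
            }
            sb.append(text, tokens.get(i)[0], tokens.get(i)[1]);
        }
        return sb.toString().toLowerCase(Locale.ROOT);
    }

    // Collects the set of all k-word shingles in 'text'.
    private static Set<String> shingles(String text, int k) {
        List<int[]> tokens = tokenize(text);
        Set<String> out = new HashSet<>();
        for (int i = 0; i + k <= tokens.size(); i++) {
            out.add(key(text, tokens, i, k));
        }
        return out;
    }

    // Character ranges {start, end} of 'doc' covered by k-word shingles that
    // also occur in 'other'. Overlapping ranges are merged into one.
    static List<int[]> sharedSpans(String doc, String other, int k) {
        Set<String> otherShingles = shingles(other, k);
        List<int[]> tokens = tokenize(doc);
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i + k <= tokens.size(); i++) {
            if (!otherShingles.contains(key(doc, tokens, i, k))) {
                continue;
            }
            int start = tokens.get(i)[0];
            int end = tokens.get(i + k - 1)[1];
            int[] last = spans.isEmpty() ? null : spans.get(spans.size() - 1);
            if (last != null && start <= last[1]) {
                last[1] = Math.max(last[1], end); // extend the previous range
            } else {
                spans.add(new int[] {start, end});
            }
        }
        return spans;
    }
}
```

Call sharedSpans(docA, docB, k) for the left pane and sharedSpans(docB, docA, k) for the right pane. In a textual interface you can print doc.substring(start, end) for each range, plus a few characters of context on either side. In Swing, each range could be passed to textArea.getHighlighter().addHighlight(start, end, painter) with a DefaultHighlighter.DefaultHighlightPainter; this assumes the text shown in the component is the same string the offsets were computed on (Swing stores line breaks as \n, so documents with \r\n line endings would need normalizing first).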