Utility or library for finding the closest matching binary file

Posted on 2024-08-18 02:36:12

I would like to be able to compare a binary file X to a directory of other binary files and find which other file is most similar to X. The nature of the data is such that identical chunks will exist between files, but possibly shifted in location. The files are all 1 MB in size, and there are about 200 of them. I would like something quick enough to analyze these in a few minutes or less on a modern desktop computer.
I've googled a bit and found a few different binary diff utilities, but none of them seems appropriate for my application.

For example, there is bsdiff, which looks like it creates a patch file that is optimized for size, and vbindiff, which just displays the differences graphically. Neither really helps me figure out whether one file is more similar to X than another file is.

If there is not a tool that I can use directly for this purpose, is there a good library someone could recommend for writing my own utility? Python would be preferable, but I'm flexible.

Comments (1)

浅紫色的梦幻 2024-08-25 02:36:12

Here's a simple Perl script that more or less tries to do exactly that.

Edit: Also have a look at the following Stack Overflow thread.
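
Since the question mentions a preference for Python, here is a rough sketch of one possible approach. It is not the Perl script referred to above, just an illustration under some assumptions: each file is reduced to the set of all its 8-byte windows (taken at every offset, so chunks that have moved still match), and two files are scored by the Jaccard similarity of those sets. The function names `shingles`, `jaccard`, and `rank_by_similarity` and the window length of 8 are made up for this example.

```python
from pathlib import Path


def shingles(data: bytes, n: int = 8) -> set:
    """Return the set of all n-byte windows in data, taken at every offset."""
    # Windows are position-independent: a chunk that moved to another
    # offset still produces the same windows.
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; defined as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_by_similarity(target: Path, directory: Path, n: int = 8) -> list:
    """Return (score, path) pairs for the files in directory, most similar first."""
    target_set = shingles(target.read_bytes(), n)
    scores = []
    for path in sorted(directory.iterdir()):
        # Skip subdirectories and the target file itself.
        if path.is_file() and path.resolve() != target.resolve():
            other_set = shingles(path.read_bytes(), n)
            scores.append((jaccard(target_set, other_set), path))
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return scores
```

Calling `rank_by_similarity(Path("X.bin"), Path("others"))` (both paths are placeholders) would return the candidates ordered by score, so the first entry is the closest match under this measure. A 1 MB file yields up to about a million windows, so memory use and running time are worth measuring on the real data; a longer window or keeping only a sample of the windows are possible ways to trade accuracy for speed.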
