比较存储在 mysql 数据库中的 SIFT 特征

发布于 2024-09-14 22:11:42 字数 722 浏览 9 评论 0原文

我目前正在扩展一个用于对图像进行分类的图像库,我想查找重复的图像、转换后的图像以及包含其他图像或包含在其他图像中的图像。
我已经测试了 OpenCV 的 SIFT 实现,它工作得很好,但对于多个图像来说会相当慢。太快了,我想我可以提取特征并将它们保存在数据库中,因为许多其他图像相关的元数据已经保存在那里。

将新图像的特征与数据库中的特征进行比较的最快方法是什么?
通常比较是使用 kd-trees、FLANN 或使用 金字塔匹配内核是我在另一个线程中找到的,但还没有深入研究。

由于我不知道如何有效地在数据库中保存和搜索 kd 树,因此我目前只看到三个选项:
* 让 MySQL 计算到数据库中每个特征的欧氏距离,尽管我确信对于多个图像来说这将花费不合理的时间。
* 首先将整个数据集加载到内存中并构建 kd 树。这可能会很快,但非常占用内存。另外,所有数据都需要从数据库传输。
* 将生成的树保存到数据库中并加载所有树,这将是最快的方法,但也会产生大量流量,因为新图像必须重建 kd 树并将其发送到服务器。

我正在使用 OpenCV 的 SIFT 实现,但我并不死心塌地。如果有一个更适合此任务的特征提取器(并且大致同样强大),我很高兴有人能推荐一个。

I'm currently extending an image library used to categorize images and i want to find duplicate images, transformed images, and images that contain or are contained in other images.
I have tested the SIFT implementation from OpenCV and it works very well but would be rather slow for multiple images. Too speed it up I thought I could extract the features and save them in a database as a lot of other image related meta data is already being held there.

What would be the fastest way to compare the features of a new images to the features in the database?
Usually comparison is done calculating the euclidean distance using kd-trees, FLANN, or with the Pyramid Match Kernel that I found in another thread here on SO, but haven't looked much into yet.

Since I don't know of a way to save and search a kd-tree in a database efficiently, I'm currently only seeing three options:
* Let MySQL calculate the euclidean distance to every feature in the database, although I'm sure that that will take an unreasonable time for more than a few images.
* Load the entire dataset into memory at the beginning and build the kd-tree(s). This would probably be fast, but very memory intensive. Plus all the data would need to be transferred from the database.
* Saving the generated trees into the database and loading all of them, would be the fastest method but also generate high amounts of traffic as with new images the kd-trees would have to be rebuilt and send to the server.

I'm using the SIFT implementation of OpenCV, but I'm not dead set on it. If there is a feature extractor more suitable for this task (and roughly equally robust) I'm glad if someone could suggest one.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

饮惑 2024-09-21 22:11:42

所以几年前我基本上做了一些与此非常相似的事情。 您想要研究的算法是几年前由 David Nister 提出的,论文是:“Scalable Recognition with a Vocabulary Tree”。他们几乎对您的问题有一个精确的解决方案,可以扩展到数百万张图像。

这是摘要的链接,您可以通过谷歌搜索标题找到下载链接。
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1641018 被

基本思想是使用分层 k 均值算法构建一棵树来对特征进行建模,然后利用该树中特征的稀疏分布来快速找到最近的邻居......或者类似的东西,它已经 自从我从事这件事以来已有几年了。您可以在作者网页上找到 Powerpoint 演示文稿:http://www. vis.uky.edu/~dnister/Publications/publications.html

其他一些说明:

  • 我不会为金字塔匹配内核而烦恼,它实际上比重复/转换图像检测更能提高对象识别能力.

  • 我不会将任何此类功能存储在 SQL 数据库中。根据您的应用程序,有时动态计算特征会更有效,因为在密集计算时它们的大小可能会超过原始图像的大小。特征直方图或指向词汇树中节点的指针效率更高。

  • SQL 数据库并不是为进行大规模浮点向量计算而设计的。 您可以在数据库中存储内容,但不要将其用作计算工具。我用 SQLite 尝试过一次,结果非常糟糕。

  • 如果您决定实现这一点,请详细阅读本文并在实现时随身携带一份副本,因为有许多小细节对于使算法高效工作非常重要。

So I basically did something very similar to this a few years ago. The algorithm you want to look into was proposed a few years ago by David Nister, the paper is: "Scalable Recognition with a Vocabulary Tree". They pretty much have an exact solution to your problem that can scale to millions of images.

Here is a link to the abstract, you can find a download link by googleing the title.
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1641018

The basic idea is to build a tree with a hierarchical k-means algorithm to model the features and then leverage the sparse distribution of features in that tree to quickly find your nearest neighbors... or something like that, it's been a few years since I worked on it. You can find a powerpoint presentation on the authors webpage here: http://www.vis.uky.edu/~dnister/Publications/publications.html

A few other notes:

  • I wouldn't bother with the pyramid match kernel, it's really more for improving object recognition than duplicate/transformed image detection.

  • I would not store any of this feature stuff in an SQL database. Depending on your application it is sometimes more effective to compute your features on the fly since their size can exceed the original image size when computed densely. Histograms of features or pointers to nodes in a vocabulary tree are much more efficient.

  • SQL databases are not designed for doing massive floating point vector calculations. You can store things in your database, but don't use it as a tool for computation. I tried this once with SQLite and it ended very badly.

  • If you decide to implement this, read the paper in detail and keep a copy handy while implementing it, as there are many minor details that are very important to making the algorithm work efficiently.

心房的律动 2024-09-21 22:11:42

我认为关键是这不是 SIFT 问题。这是一个关于近似最近邻搜索的问题。与图像匹配一样,这也是一个开放的研究问题。您可以尝试谷歌搜索“近似最近邻搜索”,看看有哪些类型的方法可用。如果您需要精确的结果,请尝试:“精确最近邻搜索”。

所有这些几何数据结构(例如 kd 树)的性能都会随着维数的增加而降低,因此我认为关键是您可能需要以较少的维数(例如 10-30)来表示 SIFT 描述符256-1024)以进行真正高效的最近邻搜索(例如使用 PCA)。

一旦你有了这个,我认为无论数据是否存储在 MySQL 中,它都将成为次要的。

The key, I think, is that is this isn't a SIFT question. It is a question about approximate nearest neighbor search. Like image matching this too is an open research problem. You can try googling "approximate nearest neighbor search" and see what type of methods are available. If you need exact results, try: "exact nearest neighbor search".

The performace of all these geometric data structures (such as kd-trees) degrade as the number of dimensions increase, so the key I think is that you may need to represent your SIFT descriptors in a lower number of dimensions (say 10-30 instead of 256-1024) to have really efficient nearest neighbor searches (use PCA for example).

Once you have this I think it will become secondary if the data is stored in MySQL or not.

北城半夏 2024-09-21 22:11:42

我认为速度不是这里的主要问题。主要问题是如何使用这些功能来获得您想要的结果。

如果你想对图像进行分类(例如人、车、房子、猫),那么 Pyramid Match 内核绝对值得一看。它实际上是局部特征描述符的直方图,因此不需要将各个特征相互比较。还有一类被称为“词袋”的算法,它试图对局部特征进行聚类以形成“视觉词汇”。同样,在这种情况下,一旦你有了“视觉词”,你就不需要计算所有 SIFT 描述符对之间的距离,而是确定每个特征属于哪个簇。另一方面,如果您想获得图像对之间的点对应关系,例如确定一个图像是否包含在另一个图像中,或者计算图像之间的变换,那么您确实需要找到精确的最近邻。

此外,还有 SIFT 以外的局部特征。例如,SURF 的特征与 SIFT 类似,但提取速度更快,并且已被证明对于某些任务表现更好。

如果您只想查找重复项,则可以使用全局图像描述符(例如颜色直方图)来删除明显不同的图像,从而大大加快搜索速度。比较两个颜色直方图比比较两个分别包含数百个 SIFT 特征的集合要快几个数量级。您可以使用颜色直方图创建简短的候选列表,然后使用 SIFT 优化搜索。

I think speed is not the main issue here. The main issue is how to use the features to get the results you want.

If you want to categorize the images (e. g. person, car, house, cat), then the Pyramid Match kernel is definitely worth looking at. It is actually a histogram of the local feature descriptors, so there is no need to compare individual features to each other. There is also a class of algorithms known as the "bag of words", which try to cluster the local features to form a "visual vocabulary". Again, in this case once you have your "visual words" you do not need to compute distances between all pairs of SIFT descriptors, but instead determine which cluster each feature belongs to. On the other hand, if you want to get point correspondences between pairs of images, such as to decide whether one image is contained in another, or to compute the transformation between the images, then you do need to find the exact nearest neighbors.

Also, there are local features other than SIFT. For example SURF are features similar to SIFT, but they are faster to extract, and they have been shown to perform better for certain tasks.

If all you want to do is to find duplicates, you can speed up your search considerably by using a global image descriptor, such as a color histogram, to prune out images that are obviously different. Comparing two color histograms is orders of magnitude faster than comparing two sets each containing hundreds of SIFT features. You can create a short list of candidates using color histograms, and then refine your search using SIFT.

驱逐舰岛风号 2024-09-21 22:11:42

我有一些 python 工具,你可以在这里使用。基本上它是一个使用 SIFT 变换向量的包,然后计算每个 128d sift 向量的最近网格散列。散列是重要的部分,因为它是局部敏感的,简单地说,R^n 空间中靠近的向量会产生等效的散列冲突概率。我提供的工作是 Andoni 的扩展,它提供了用于修剪 LSH 的查询自适应启发式方法精确的搜索列表,以及哈希函数的优化 CUDA 实现。我还有一个小应用程序,可以进行图像数据库搜索,并提供良好的视觉反馈,所有这些都在 bsd 下(SIFT 除外,它有一些额外的限制)。

I have some tools in python you can play with here . Basically its a package that uses SIFT transformed vectors, and then computes a nearest lattice hashing of each 128d sift vector. The hashing is the important part, as it is locality sensitive, simply meaning that vectors near in R^n space result in equivalent hash collision probabilities. The work I provide is an extension of Andoni that provides a query adaptive heuristic for pruning the LSH exact search lists, as well as an optimized CUDA implementation of the hashing function. I also have a small app that does image database search with nice visual feedback, all under bsd (exception is SIFT which has some additional restrictions).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文