将特征与数据库匹配的快速(呃)方法

发布于 2024-09-30 10:55:34 字数 991 浏览 15 评论 0原文

我正在开发一个项目,其中图像中有一个特征被描述为一组 X 和 X 。 Y 坐标(每个特征 5-10 个点)对于该特征来说是唯一的。我还有一个包含数千个特征的数据库,其中每个特征都有相同类型的描述符。结果看起来像这样:

myFeature: (x1,y1), (x2,y2), (x3,y3)...

myDatabase: Feature1: (x1,y1), (x2,y2), (x3,y3)...
            Feature2: (x1,y1), (x2,y2), (x3,y3)...
            Feature3: (x1,y1), (x2,y2), (x3,y3)...
            ...

我想在 myDatabase 的特征中找到 myFeature 的最佳匹配。

匹配这些功能的最快方法是什么?目前,我正在逐步浏览数据库中的每个功能并比较每个单独的点:

bestScore = 0
for each feature in myDatabase:
    score = 0
    for each point descriptor in MyFeature:
        find minimum distance from the current point to the...
          points describing the current feature in the database
        if the distance < threshold:
            there is a match to the current point in the target feature
            score += 1

    if score > bestScore:
        save feature as new best match

此搜索有效,但显然在大型数据库上它会变得非常慢。有谁知道进行此类搜索的更快方法,或者至少是否有一种方法可以快速排除明显与描述符不匹配的特征?

I'm working on a project where I have a feature in an image described as a set of X & Y coordinates (5-10 points per feature) which are unique for this feature. I also have a database with thousands of features where each have the same type of descriptor. The result looks like this:

myFeature: (x1,y1), (x2,y2), (x3,y3)...

myDatabase: Feature1: (x1,y1), (x2,y2), (x3,y3)...
            Feature2: (x1,y1), (x2,y2), (x3,y3)...
            Feature3: (x1,y1), (x2,y2), (x3,y3)...
            ...

I want to find the best match of myFeature in the features in myDatabase.

What is the fastest way to match these features? Currently I am stepping though each feature in the database and comparing each individual point:

bestScore = 0
for each feature in myDatabase:
    score = 0
    for each point descriptor in MyFeature:
        find minimum distance from the current point to the...
          points describing the current feature in the database
        if the distance < threshold:
            there is a match to the current point in the target feature
            score += 1

    if score > bestScore:
        save feature as new best match

This search works, but clearly it gets painfully slow on large databases. Does anyone know of a faster method to do this type of search, or at least if there is a way to quickly rule out features that clearly won't match the descriptor?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

入画浅相思 2024-10-07 10:55:34

根据每个特征创建一个位集(1 和 0 的数组)。

为您的搜索条件创建这样的位掩码,然后只需使用按位 和 将搜索掩码与您的功能进行比较。

通过这种方法,您可以将大部分工作转移到负责保存内容的例程中。此外,创建位掩码不应该是计算密集型的。

如果您只是想排除绝对无法匹配的功能,那么您的掩码创建算法应该考虑到这一点,并创建有点模糊的位掩码。

创建此类蒙版的最简单方法可能是创建一个与特征矩阵一样大的矩阵,并在为该特征设置的每个坐标中放置一个 1,在每个不是特征的坐标中放置一个零。然后将该矩阵转换为一维行。然后将特征行与搜索掩码按位进行比较。

这类似于位图索引在大型数据库(例如oracle)上的工作方式,但目的不同,并且没有内存中所有数据库行的完整位图图像。

其威力在于按位比较。

在 32 位机器上,当您在点比较中只能对整数进行一次比较时,您可以对每条指令执行 32 次比较。它为浮点运算产生更高的 boni,具体取决于架构。

Create a bitset (an array of 1s and 0s) from each feature.

Create such a bitmask for your search criteria and then just use a bitwise and to compare the search mask to your features.

With this approach, you can shift most work to the routines responsible for saving the stuff. Also, creating the bitmasks should not be that computationally intensive.

If you just want to rule out features that absolutely can't match, then your mask-creation algorithm should take care of that and create the bitmasks a bit fuzzy.

The easiest way to create such masks is probably by creating a matrix as big as the matrix of your features and put a one in every coordinate that is set for the feature and a zero in every coordinate that isn't. Then turn that matrix into a one dimensional row. Compare the feature-row then to the search mask bitwise.

This is similar to the way bitmap indexes work on large databases (oracle e.g.), but with a different intention and without a full bitmap-image of all database rows in memory.

The power of this is in the bitwise comparisons.

On a 32bit machine you can perform 32 comparisons per instruction when you can just do one with integer numbers in a point comparison. It yields even higher boni for floating point operations, depending on the architecture.

┊风居住的梦幻卍 2024-10-07 10:55:34

这通常看起来像一个空间索引问题。这不是我的领域,但您可能需要构建一种树索引,例如四叉树,您可以使用它来轻松搜索功能。您可以从这篇维基百科文章中找到一些链接:http://en.wikipedia.org/wiki/Spatial_index

这可能是一个可以在现有空间数据库中轻松实现的问题。它的描述非常像 GIS。

您可以做的一件事是计算每个特征的重心,并使用它来稍微缩小搜索空间(一维搜索更容易为其建立索引),但这有一个缺点,那就是只是一个启发式(根据特征的形状,重心可能会出现在奇怪的地方)。

This in general looks like a spatial index problem. It's not my field, but you'll probably need to build a sort of tree index, such as a quadtree, that you can use to easily search for features. You can find some links from this wikipedia article: http://en.wikipedia.org/wiki/Spatial_index

It might be a problem that you can easily implement in an existing spatial database. It's very GIS-like in its description.

One thing you can do is calculate a point of gravity for every feature and use that to whittle down the search space a bit (a one dimensional search is a lot easier to build an index for), but that has the downside of being just a heuristic (depending on the shapes of your feature, the point of gravity may end up in weird places).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文