找到图中没有已知方程的最近点的有效算法

发布于 2024-11-25 20:14:15 字数 709 浏览 0 评论 0原文

我出于好奇而问这个问题,因为我的快速而肮脏的实现似乎已经足够好了。不过我很好奇更好的实现是什么。

我有一张真实世界数据的图表。图表中不存在重复的 X 值,并且 X 值以一致的速率递增,但 Y 数据基于现实世界的输出。我想以编程方式找到图表上距任意给定点 P 最近的点。我正在尝试找到一种有效(即快速)的算法来执行此操作。我不需要确切的最近点,我可以选择“几乎”最近的点。

明显的惰性解决方案是递增图中的每个点,计算距离,然后找到距离的最小值。然而,理论上对于大图来说这可能会很慢;对于我想要的东西来说太慢了。

由于我只需要一个近似最近的点,我想理想的最快方程将涉及生成一条最佳拟合线并使用该线实时计算该点应位于的位置;但这听起来像是一个潜在的数学难题,我不打算承担。

我的解决方案是一个 hack,它之所以有效,只是因为我假设我的点 P 不是任意的,即我假设 P 通常会接近我的图形线,当发生这种情况时,我可以从考虑中划掉远处的 X 值。我计算与 P 共享 X 坐标的线上点的距离,并使用该点与 P 之间的距离来计算可能是较近点的最大/最小 X 值。

我忍不住觉得应该有一个比我的解决方案更快的算法(这只是有用,因为我假设 99% 的时间我的点 P 已经是接近直线的点)。我尝试在谷歌上搜索更好的算法,但发现了很多不太合适的算法,以至于很难在所有混乱的不合适的算法中找到我正在寻找的东西。那么,这里有人有一个更有效的建议算法吗?请记住,我不需要完整的算法,因为我所拥有的算法可以满足我的需求,我只是好奇正确的解决方案是什么。

I'm asking this questions out of curiostity, since my quick and dirty implementation seems to be good enough. However I'm curious what a better implementation would be.

I have a graph of real world data. There are no duplicate X values and the X value increments at a consistant rate across the graph, but Y data is based off of real world output. I want to find the nearest point on the graph from an arbitrary given point P programmatically. I'm trying to find an efficient (ie fast) algorithm for doing this. I don't need the the exact closest point, I can settle for a point that is 'nearly' the closest point.

The obvious lazy solution is to increment through every single point in the graph, calculate the distance, and then find the minimum of the distance. This however could theoretically be slow for large graphs; too slow for what I want.

Since I only need an approximate closest point I imagine the ideal fastest equation would involve generating a best fit line and using that line to calculate where the point should be in real time; but that sounds like a potential mathematical headache I'm not about to take on.

My solution is a hack which works only because I assume my point P isn't arbitrary, namely I assume that P will usually be close to my graph line and when that happens I can cross out the distant X values from consideration. I calculating how close the point on the line that shares the X coordinate with P is and use the distance between that point and P to calculate the largest/smallest X value that could possible be closer points.

I can't help but feel there should be a faster algorithm then my solution (which is only useful because I assume 99% of the time my point P will be a point close to the line already). I tried googling for better algorithms but found so many algorithms that didn't quite fit that it was hard to find what I was looking for amongst all the clutter of inappropriate algorithms. So, does anyone here have a suggested algorithm that would be more efficient? Keep in mind I don't need a full algorithm since what I have works for my needs, I'm just curious what the proper solution would have been.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

避讳 2024-12-02 20:14:15

如果您将 [x,y] 点存储在 四叉树 中,您将能够找到快速最接近的一个(类似于 O(log n))。我认为这是你能做的最好的事情,而不用假设重点在哪里。不要在此处重复该算法,请查看此链接

您的解决方案非常好,通过检查 y 中的点如何变化,您是否无法计算需要检查的沿 x 轴的点数量界限,而不是使用任意点。

If you store the [x,y] points in a quadtree you'll be able to find the closest one quickly (something like O(log n)). I think that's the best you can do without making assumptions about where the point is going to be. Rather than repeat the algorithm here have a look at this link.

Your solution is pretty good, by examining how the points vary in y couldn't you calculate a bound for the number of points along the x axis you need to examine instead of using an arbitrary one.

小帐篷 2024-12-02 20:14:15

假设您的观点 P=(x,y) 且您的实际数据是一个函数 y=f(x)

第 1 步:计算 r=| f(x)-y|

步骤 2:查找区间 I=(xr,x+r) 中的点

步骤 3:查找 I 中与 P 最接近的点。

Let's say your point P=(x,y) and your real-world data is a function y=f(x)

Step 1: Calculate r=|f(x)-y|.

Step 2: Find points in the interval I=(x-r,x+r)

Step 3: Find the closest point in I to P.

帅哥哥的热头脑 2024-12-02 20:14:15

如果您可以使用数据结构,那么用于空间搜索(包括最近邻)的一些常见数据结构是......

  • 四叉树(和八叉树等)。
  • kd-tree
  • bsp 树(仅适用于静态点集)。
  • r 树

r 树有多种变体。它与 B+ 树非常密切相关,但叶节点中的项目(点)具有不同的排序(取决于变体)。

Hilbert R 树基于 Hilbert 曲线使用严格的点排序。希尔伯特曲线(或者更确切地说是它的概括)非常擅长对多维数据进行排序,因此空间中邻近的点通常在线性排序中是邻近的。

原则上,希尔伯特排序可以通过对简单的点数组进行排序来应用。其中的自然聚类意味着搜索通常只需要搜索数组中的几个相当短的跨度 - 复杂的是您需要计算出它们是哪些跨度。

我曾经有一个关于进行希尔伯特曲线排序计算的好论文的链接,但我把它弄丢了。基于格雷码的排序会更简单,但在聚类方面不太有效。事实上,格雷码和希尔伯特曲线之间存在着深刻的联系——我丢失的那篇论文大量使用了格雷码相关的函数。

编辑 - 我发现该链接 - http ://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.133.7490

If you can use a data structure, some common data structures for spacial searching (including nearest neighbour) are...

  • quad-tree (and octree etc).
  • kd-tree
  • bsp tree (only practical for a static set of points).
  • r-tree

The r-tree comes in a number of variants. It's very closely related to the B+ tree, but with (depending on the variant) different orderings on the items (points) in the leaf nodes.

The Hilbert R tree uses a strict ordering of points based on the Hilbert curve. The Hilbert curve (or rather a generalization of it) is very good at ordering multi-dimensional data so that nearby points in space are usually nearby in the linear ordering.

In principle, the Hilbert ordering could be applied by sorting a simple array of points. The natural clustering in this would mean that a search would usually only need to search a few fairly-short spans in the array - with the complication being that you need to work out which spans they are.

I used to have a link for a good paper on doing the Hilbert curve ordering calculations, but I've lost it. An ordering based on Gray codes would be simpler, but not quite as efficient at clustering. In fact, there's a deep connection between Gray codes and Hilbert curves - that paper I've lost uses Gray code related functions quite a bit.

EDIT - I found that link - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.133.7490

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文