How to efficiently calculate the distance between coordinates in Oracle
I have a large Oracle database (approximately 720,000 records) where each record has its own geographic coordinates (latitude and longitude), and I need to select only the records that lie within a specific distance of a point (inside a specific radius).
Currently I've implemented a distance function (based on haversine) that I found in an Oracle forum, but because the database is fairly large, each select takes about 50 seconds.
Any recommendations on how to do this efficiently? I know there is an extension called Oracle Spatial and Locator, but I don't know whether I can buy it or even how it works. Thanks a lot in advance. Best regards
Comments (8)
Use a better algorithm. Instead of calculating the actual Euclidean distance, which requires a square-root calculation, do your select on the linear distance, which requires only subtraction and addition. I.e., if your point is at (10, 10) and your radius is 5, select all places with points inside the square formed by (10 +/- 5, 10 +/- 5).
This will catch a small number of false positives in the corners of the square. Eliminate these by double-checking the results in your application, calculating the proper Euclidean distance for each candidate.
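To make the square-then-circle idea concrete, here is a minimal Python sketch. It assumes plain planar (x, y) points as in the (10, 10) example above, not latitude/longitude, and the function name `in_square_then_circle` is made up for illustration.

```python
import math

def in_square_then_circle(points, center, radius):
    """Cheap square test first, exact Euclidean check only on the survivors."""
    cx, cy = center
    hits = []
    for x, y in points:
        # Step 1: square test, only subtraction and comparison.
        if abs(x - cx) > radius or abs(y - cy) > radius:
            continue
        # Step 2: exact check, which removes the false positives in the corners.
        if math.hypot(x - cx, y - cy) <= radius:
            hits.append((x, y))
    return hits

candidates = [(10, 10), (14, 14), (15, 10), (16, 10), (10, 4)]
print(in_square_then_circle(candidates, (10, 10), 5))
```

The point (14, 14) passes the square test but fails the exact check, which is the corner case the answer describes. In a database the first step would be the part that can use an index; the second step only runs on what is left.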
Do provide more details about the specific format of the Lat and Long values, as well as the specific formula used for implementing haversine.
There are three approaches which can speed things up. Depending on the situation, we can apply at least two of these.
1. Weed out as many records as possible with a simple attribute value comparison. For these records, we don't need to calculate anything at all. For example, convert the maximum radius requirement to a [generous but approximate] range of longitude (and possibly latitude) values which would qualify.
2. Use an alternative (possibly approximate) distance measurement. For example, it may be faster to calculate the square of the Euclidean distance, based on rounded-up coordinates (and, of course, to compare this with the square of the desired radius).
3. Improve the way the haversine formula is implemented.
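As a rough illustration of the first approach, the sketch below converts a radius into latitude and longitude ranges. It assumes a spherical Earth with a mean radius of 3958.8 miles and coordinates in degrees; the function name `bounding_box` is hypothetical, and wrap-around at the +/-180 degree meridian is not handled.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius (spherical model)

def bounding_box(lat_deg, lon_deg, radius_miles):
    """Return (lat_min, lat_max, lon_min, lon_max) in degrees enclosing the circle."""
    delta = radius_miles / EARTH_RADIUS_MILES  # angular radius in radians
    lat = math.radians(lat_deg)
    lat_min = max(lat - delta, -math.pi / 2)
    lat_max = min(lat + delta, math.pi / 2)
    if abs(lat) + delta < math.pi / 2:
        # Widest longitude offset reached by the circle on a sphere.
        dlon = math.degrees(math.asin(math.sin(delta) / math.cos(lat)))
        lon_min, lon_max = lon_deg - dlon, lon_deg + dlon
    else:
        # The circle contains a pole, so every longitude qualifies.
        lon_min, lon_max = -180.0, 180.0
    return math.degrees(lat_min), math.degrees(lat_max), lon_min, lon_max

print(bounding_box(40.0, -75.0, 10.0))
```

The four numbers would then serve as plain range conditions on the latitude and longitude columns, so records outside the box are rejected without any trigonometry.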
A couple of suggestions, if you aren't already doing them...
Since the haversine calculation requires the angles in radians, if you are storing latitude and longitude in degrees, add a couple of columns and precompute the radian equivalents. More generally, precompute any of the values used in the formula that you can, and store them.
Consider using a simpler function to eliminate points that are well outside the radius, running the haversine function only on those that are potential matches based on the simpler function. For degrees you could use SQRT((69.1*dLat)^2 + (53*dLong)^2), which gives an approximate distance in miles, and apply some fudge factor (10%). Run your haversine calculation only on the points that match the cruder approximation, and only if you need better accuracy than the simpler calculation provides.
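A small Python sketch of both suggestions together, under a few assumptions: distances are in miles, the Earth is treated as a sphere of radius 3958.8 miles, and the names `Row`, `make_row`, `crude_miles`, `haversine_miles` and `within_radius` are invented for illustration. Note that 53 is roughly 69.1 * cos(40 degrees), so the crude filter fits mid-latitudes best; much closer to the poles a 10% fudge factor may not be enough to avoid dropping real matches.

```python
import math
from typing import NamedTuple

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius

class Row(NamedTuple):
    lat: float      # degrees
    lon: float      # degrees
    lat_rad: float  # precomputed, as if stored in an extra column
    lon_rad: float
    cos_lat: float

def make_row(lat, lon):
    lat_rad = math.radians(lat)
    return Row(lat, lon, lat_rad, math.radians(lon), math.cos(lat_rad))

def crude_miles(p, q):
    # The cheap approximation from the answer: no trigonometry at query time.
    return math.sqrt((69.1 * (q.lat - p.lat)) ** 2 + (53.0 * (q.lon - p.lon)) ** 2)

def haversine_miles(p, q):
    # Haversine using the precomputed radians and cosines.
    a = (math.sin((q.lat_rad - p.lat_rad) / 2) ** 2
         + p.cos_lat * q.cos_lat * math.sin((q.lon_rad - p.lon_rad) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))

def within_radius(center, rows, radius_miles, fudge=0.10):
    limit = radius_miles * (1 + fudge)
    return [r for r in rows
            if crude_miles(center, r) <= limit           # cheap filter first
            and haversine_miles(center, r) <= radius_miles]

center = make_row(40.0, -75.0)
rows = [make_row(40.05, -75.0), make_row(41.0, -75.0), make_row(40.0, -75.09)]
print([(r.lat, r.lon) for r in within_radius(center, rows, 5.0)])
```

Because `and` short-circuits, the haversine function only runs for rows that survive the crude test, which is the whole point of the two-stage filter.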
If you have the license, then Oracle Spatial might be of use.
Oracle Docs - Oracle Spatial
I've not used it, but a quick scan of the docs points to the function SDO_WITHIN_DISTANCE.
Is the "specific distance" somewhat constant? I.e., are you always searching for "all points within 1 mile", or does the radius change?
What percentage of the total records do you expect to get back in any given query? 10%? 0.10%?
If you will always have the same radius, build a grid of squares whose side length equals the radius. Assign each square a list of its neighboring squares. Each point will know what square it is in, from which you can get a list of all the neighboring squares. Then run the calculation on only the points in those squares. This is similar to the other answers that have popped up, but will be quicker because the linear calculations are approximated by an indexed lookup rather than calculated between every point.
Even with a variable radius, you can still use the above, but you'll have to calculate how many 'neighbors' to include. These approaches are only feasible if you expect to get a small subset of the total from any individual query.
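The grid idea might look roughly like this in Python. This is a sketch under assumptions: the coordinates are planar (for example already projected to miles), the grid is an in-memory dictionary standing in for an indexed cell column, and the names `cell_of`, `build_grid` and `query` are hypothetical.

```python
import math
from collections import defaultdict

def cell_of(x, y, size):
    """Grid cell (column, row) containing the point."""
    return (math.floor(x / size), math.floor(y / size))

def build_grid(points, size):
    grid = defaultdict(list)
    for x, y in points:
        grid[cell_of(x, y, size)].append((x, y))
    return grid

def query(grid, size, cx, cy, radius):
    # Number of neighboring cells to include in each direction;
    # 1 when the radius equals the cell size, more for a larger radius.
    reach = math.ceil(radius / size)
    ci, cj = cell_of(cx, cy, size)
    hits = []
    for i in range(ci - reach, ci + reach + 1):
        for j in range(cj - reach, cj + reach + 1):
            for x, y in grid.get((i, j), ()):
                # The exact check runs only on points in nearby cells.
                if math.hypot(x - cx, y - cy) <= radius:
                    hits.append((x, y))
    return hits

grid = build_grid([(0.5, 0.5), (1.5, 0.5), (5.0, 5.0), (-0.5, 0.2)], 1.0)
print(sorted(query(grid, 1.0, 0.6, 0.5, 1.0)))
```

The `reach` value is the "how many neighbors to include" calculation for a variable radius. The approach pays off only when the nearby cells hold a small share of all points, as the answer notes.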
If you don't need the distance to be too accurate, you can just treat the earth as flat. From this discussion:
I recently did some optimizing for MySQL (outlined here: www.mooreds.com/wordpress/archives/000547 [sorry, I only get 1 hyperlink per post]) but am not sure how many of the steps I went through are applicable to Oracle. Some definitely are (like using a bounding box if possible).
You can get a much more accurate result... if you change the 53.0 magic number... to also take into account the change in latitude. (It gets progressively smaller as you move toward the poles.)
Does anyone have that magic-magic formula?
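One common way to do this, offered here as a sketch and not as the formula the poster had in mind, is to scale the longitude factor by the cosine of the latitude. On a spherical model a degree of longitude shrinks in proportion to cos(latitude), and 69.1 * cos(40 degrees) is about 52.9, which is where a constant like 53.0 fits. The function names below are invented for illustration.

```python
import math

MILES_PER_DEGREE_LAT = 69.1  # the constant used in the thread's approximation

def miles_per_degree_lon(lat_deg):
    """Approximate length of one degree of longitude at the given latitude."""
    return MILES_PER_DEGREE_LAT * math.cos(math.radians(lat_deg))

def flat_distance_miles(lat1, lon1, lat2, lon2):
    """Flat-earth distance with a latitude-dependent longitude scale."""
    mid_lat = (lat1 + lat2) / 2  # assumption: scale at the midpoint latitude
    x = MILES_PER_DEGREE_LAT * (lat2 - lat1)
    y = miles_per_degree_lon(mid_lat) * (lon2 - lon1)
    return math.hypot(x, y)

for lat in (0, 40, 60):
    print(lat, round(miles_per_degree_lon(lat), 2))
```

This is still a flat approximation, so it is best suited to short distances; the haversine check remains the fallback when more accuracy is needed.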
First of all, haversine is not perfect, because the Earth is not a perfect sphere - read http://www.movable-type.co.uk/scripts/latlong-vincenty.html
Second - PL/SQL is not a perfect tool for programming calculations with many lines of code that will be called many times. If you implement your math in Java or C++, you will get a huge performance improvement. C++ or Java code can be called from Oracle just like a function.
Third - the people who commented that you need to cut out as many points as possible with a simple rectangular bounding box are quite right. Create an index on the longitude and latitude columns; it will help execute that bounding-box clause.
Lastly, I don't think Oracle Spatial has to be involved here - it is overkill. If you already have it and have created an SDO_GEOMETRY column, that is one story, but if not - I would not consider it.