SQL中按距离搜索

发布于 2024-12-01 15:00:17 字数 856 浏览 0 评论 0原文

我正在规划一个网站(Drupal/MySQL),它必须根据距某个位置的距离(我们从大约 20,000 个位置开始)搜索相当大的数据库。到目前为止,我发现以合理方式搜索的最佳解决方案是使用 SQL 中的用户定义函数来计算坐标之间的距离,例如:(

SELECT *, CoordinateDistanceMiles(lat, lon, ${inputLat}, ${inputLon}) as distance
FROM items WHERE distance < {$radius}

使用 John Dyer 的 距离函数 或类似的)

但是,我还了解到 UDF 效率非常低。我的第二个想法(也是暂定计划)是在这个查询中嵌套另一个查询以缩小其范围,从而在更小的项目子集上运行 UDF,例如:

SELECT *, CoordinateDistanceMiles(lat, lon, ${inputLat}, ${inputLon}) as distance
FROM (
    SELECT * FROM items WHERE
        lat BETWEEN ${inputLat - const} AND ${inputLat + const} AND
        lon BETWEEN ${inputLon - const} AND ${inputLon + const}
) WHERE distance < ${radius}

这个模型会使搜索更快,还是更复杂?还有更好的解决方案吗?

I am planning a website (Drupal/MySQL), which must search a fairly large database based on distance from a location (we're starting with ~20,000 locations). So far, the best solution I've found to searching in a reasonable manner is to use a user-defined function in SQL to calculate the distance between to coordinates, e.g.:

SELECT *, CoordinateDistanceMiles(lat, lon, ${inputLat}, ${inputLon}) as distance
FROM items WHERE distance < {$radius}

(Using John Dyer's distance function or similar)

However, I've also read that UDFs are very inefficient. My second idea (and tentative plan) is to nest another query inside this one to narrow its' scope and therefore run the UDF on a much smaller subset of items, e.g.:

SELECT *, CoordinateDistanceMiles(lat, lon, ${inputLat}, ${inputLon}) as distance
FROM (
    SELECT * FROM items WHERE
        lat BETWEEN ${inputLat - const} AND ${inputLat + const} AND
        lon BETWEEN ${inputLon - const} AND ${inputLon + const}
) WHERE distance < ${radius}

Would this model make the search faster, or just more convoluted? Are there any better solutions?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

两个我 2024-12-08 15:00:17

此处使用 UDF 的开销可以忽略不计,只要您在 distance distance 上执行扫描即可。 ${radius} 并有 2 个基于范围的比较(它们无法使用索引进行优化)。

因此,不必担心 UDF“效率低下”并使用它,因为它更具可读性。

The overhead of using UDF here is negligible, as long as you perform scan over distance < ${radius} and have 2 range-based comparisons (they cannot be optimized with indexes).

So don't worry about UDF "inefficiency" and use it, since it is much more readable.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文