What is the best way to query a database for records within n miles of a zip code?

Posted on 2024-07-12 23:01:28

I have a list of records in my database and each record is associated with a zip code.

What is the "best-practice" for querying all the records in my database to find all entries that are within n miles of another zip code?

Each zip code has a lat/long associated with it in the database so I know I'll have to use that. However, I can't imagine running any sort of distance formula on each pair of zip codes, converting to miles and rejecting those that aren't within my radius.

That seems awfully computationally expensive for such a common query.

I've also considered doing an all-pairs pre-computation, but that seems too large to be practical. There are approximately 40,000 zip codes in the US, so an all-pairs table would have (40,000)^2, or 1.6 billion, entries.

I know this is a common problem on websites so hopefully someone can point me in the right direction for the best way. I'm using SQL Server 2008 and if there are pre-built solutions out there then great, because I really don't want to re-invent the wheel in this instance.


Related Question: Getting all zip codes within radius (this didn't help me)
Also, I know of this SourceForge project but it is derelict and no longer in use.

Comments (4)

蹲墙角沉默 2024-07-19 23:01:28

I would run a query that returns all records inside the square envelope encompassing the radial search circle (minlat < lat < maxlat and minlong < long < maxlong), and then post-process the result to keep only the points within the radius circle itself.
(Make sure your lat and long fields are indexed.)
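As a rough illustration of the envelope-then-filter idea, here is a minimal Python sketch. The function names, the sample points, and the use of the haversine formula for the exact check are assumptions made for this example, not something the answer specifies. In a real setup the box test would be the WHERE clause of the database query and only the surviving rows would reach the distance check.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle.

    Assumes the circle does not touch a pole or cross the 180-degree meridian.
    """
    ang = radius_miles / EARTH_RADIUS_MILES  # angular radius in radians
    dlat = math.degrees(ang)
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def within_radius(records, lat, lon, radius_miles):
    """Cheap box rejection first, exact distance check second."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_miles)
    hits = []
    for label, rlat, rlon in records:
        if not (min_lat <= rlat <= max_lat and min_lon <= rlon <= max_lon):
            continue  # outside the square envelope
        dist = haversine_miles(lat, lon, rlat, rlon)
        if dist <= radius_miles:  # inside the circle itself
            hits.append((label, dist))
    return hits


# Made-up sample points (label, lat, lon), not real zip code data.
records = [
    ("A", 40.10, -75.00),  # inside the circle
    ("B", 41.00, -75.00),  # outside the envelope
    ("C", 40.13, -74.83),  # inside the envelope, outside the circle
]
hits = within_radius(records, 40.0, -75.0, 10)
```

Point C shows why the post-processing step matters: it passes the box test because it sits near a corner of the square, but it is farther than 10 miles from the center.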

If you want to get fancy, SQL Server supports spatial indexes.

偷得浮生 2024-07-19 23:01:28

I run a site that needs to run this query about once per second per user, and here's what I've learned:

First off, make sure your location table has indexes on Lat and Lon. That's the difference between 20ms and 15s response times if you have millions of records.

Start off with a bounding-box query to get a set of locations to work with. Then calculate distances on those, sort, and if you're fussy about accuracy, filter a few out.

Frankly, I wouldn't worry about pre-computing anything. Like I say, I run this type of query against a location table with 6,000,000 entries, and it usually returns results in under 50 ms. Depending on your needs, that really ought to be fast enough.
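The workflow described above (index, bounding-box query, then distance, sort, and filter) might look roughly like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table name `locations`, its columns, the index names, and the sample rows are all hypothetical.

```python
import math
import sqlite3

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearby(conn, lat, lon, radius_miles):
    """Bounding-box query in SQL, then distance, filter, and sort in Python."""
    ang = radius_miles / EARTH_RADIUS_MILES
    dlat = math.degrees(ang)
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    rows = conn.execute(
        "SELECT name, lat, lon FROM locations "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?",
        (lat - dlat, lat + dlat, lon - dlon, lon + dlon),
    ).fetchall()
    scored = [(haversine_miles(lat, lon, rlat, rlon), name)
              for name, rlat, rlon in rows]
    # Drop box corners that fall outside the circle, nearest first.
    return sorted((dist, name) for dist, name in scored if dist <= radius_miles)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (name TEXT, lat REAL, lon REAL)")
# Indexes on both coordinate columns, as the answer recommends.
conn.execute("CREATE INDEX idx_locations_lat ON locations (lat)")
conn.execute("CREATE INDEX idx_locations_lon ON locations (lon)")
conn.executemany(
    "INSERT INTO locations VALUES (?, ?, ?)",
    [
        ("p2", 40.10, -75.00),  # made-up rows, not real locations
        ("p1", 40.05, -75.00),
        ("p3", 40.13, -74.83),  # in the box, outside the circle
        ("p4", 41.00, -75.00),  # outside the box
    ],
)
results = nearby(conn, 40.0, -75.0, 10)
```

The database only ever evaluates simple range comparisons, which is what lets the indexes do the work. The trigonometry runs on the handful of rows that survive the box test.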

Good luck!

染火枫林 2024-07-19 23:01:28

This is in fact a very hard problem to solve. I would recommend cheating a little by pre-building a lookup table. Create a grid at whatever resolution you need, for example a grid point every 10 miles in each direction. For each grid point, add an entry to the database for every zip code near it, together with the distance. When a query comes in, first translate the query point to the nearest grid point; you can then look up the distances directly.

This solution basically trades space for time, so the database can grow quite large quickly. The good news is that the data is very easy to index.
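One way to read the grid idea is sketched below in Python. Several things here are assumptions made for illustration: the grid is spaced in degrees rather than miles (to keep the snapping arithmetic trivial), the table is an in-memory dict rather than a database table, and the names `snap`, `build_grid_table`, and `lookup` as well as the sample zip labels are hypothetical.

```python
import math
from collections import defaultdict

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def snap(lat, lon, step_deg):
    """Integer index of the grid point nearest to (lat, lon)."""
    return round(lat / step_deg), round(lon / step_deg)


def build_grid_table(zips, step_deg, max_radius_miles):
    """Precompute, per grid point, every zip within max_radius_miles of it."""
    ang = max_radius_miles / EARTH_RADIUS_MILES
    dlat = math.degrees(ang)
    table = defaultdict(list)
    for zip_code, zlat, zlon in zips:
        dlon = math.degrees(
            math.asin(math.sin(ang) / math.cos(math.radians(zlat))))
        i_lo = math.floor((zlat - dlat) / step_deg)
        i_hi = math.ceil((zlat + dlat) / step_deg)
        j_lo = math.floor((zlon - dlon) / step_deg)
        j_hi = math.ceil((zlon + dlon) / step_deg)
        for i in range(i_lo, i_hi + 1):
            for j in range(j_lo, j_hi + 1):
                dist = haversine_miles(i * step_deg, j * step_deg, zlat, zlon)
                if dist <= max_radius_miles:
                    table[(i, j)].append((dist, zip_code))
    return table


def lookup(table, lat, lon, step_deg, radius_miles):
    """Snap the query point to a grid point and read the stored distances."""
    entries = table.get(snap(lat, lon, step_deg), [])
    return sorted((d, z) for d, z in entries if d <= radius_miles)


# Made-up entries (label, lat, lon), not real zip code data.
zips = [("Z1", 40.02, -75.01), ("Z2", 40.50, -75.00)]
table = build_grid_table(zips, step_deg=0.1, max_radius_miles=25)
found = lookup(table, 40.03, -74.98, step_deg=0.1, radius_miles=10)
```

Note that the stored distances are measured from the grid point, not from the actual query point, so results can be off by up to roughly half the diagonal of a grid cell. A finer grid reduces that error at the cost of a larger table, which is the space-for-time trade the answer mentions.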

如果没有 2024-07-19 23:01:28

You should look at GeoNames.org. You can query their web service for what you are looking for, or you can download their database.
