如何扩展这个 SQL 查询来查找 k 个最近邻居?
我有一个充满二维数据的数据库 - 地图上的点。每条记录都有一个几何类型的字段。我需要做的是将一个点传递给一个存储过程,该存储过程返回k个最近的点(k也将传递给存储过程,但这很容易)。我在 http://blogs 中找到了一个查询。 msdn.com/isaac/archive/2008/10/23/nearest-neighbors.aspx 获取单个最近邻,但我不知道如何扩展它来查找k 最近的邻居。
这是当前查询 - T
是表格,g
是几何字段,@x
是要搜索的点,Numbers
是一个包含整数 1 到 n 的表:
DECLARE @start FLOAT = 1000;
WITH NearestPoints AS
(
SELECT TOP(1) WITH TIES *, T.g.STDistance(@x) AS dist
FROM Numbers JOIN T WITH(INDEX(spatial_index))
ON T.g.STDistance(@x) < @start*POWER(2,Numbers.n)
ORDER BY n
)
SELECT TOP(1) * FROM NearestPoints
ORDER BY n, dist
内部查询选择最近的非空区域,然后外部查询选择该区域的顶部结果;外部查询可以轻松更改为(例如)SELECT TOP(20)
,但如果最近的区域仅包含一个结果,则您将陷入困境。
我想我可能需要递归搜索包含 k 记录的第一个区域,但不使用表变量(这会导致维护问题,因为您必须创建表结构并且它很容易发生变化 - 那里有很多字段),我不明白如何。
I have a database full of two-dimensional data - points on a map. Each record has a field of the geometry type. What I need to be able to do is pass a point to a stored procedure which returns the k nearest points (k would also be passed to the sproc, but that's easy). I've found a query at http://blogs.msdn.com/isaac/archive/2008/10/23/nearest-neighbors.aspx which gets the single nearest neighbour, but I can't figure how to extend it to find the k nearest neighbours.
This is the current query - T
is the table, g
is the geometry field, @x
is the point to search around, Numbers
is a table with integers 1 to n:
DECLARE @start FLOAT = 1000;
WITH NearestPoints AS
(
SELECT TOP(1) WITH TIES *, T.g.STDistance(@x) AS dist
FROM Numbers JOIN T WITH(INDEX(spatial_index))
ON T.g.STDistance(@x) < @start*POWER(2,Numbers.n)
ORDER BY n
)
SELECT TOP(1) * FROM NearestPoints
ORDER BY n, dist
The inner query selects the nearest non-empty region and the outer query then selects the top result from that region; the outer query can easily be changed to (e.g.) SELECT TOP(20)
, but if the nearest region only contains one result, you're stuck with that.
I figure I probably need to recursively search for the first region containing k records, but without using a table variable (which would cause maintenance problems as you have to create the table structure and it's liable to change - there're lots of fields), I can't see how.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果从内部查询中删除
TOP (1) WITH TIES
,并将外部查询设置为返回前 k 行,会发生什么情况?我也有兴趣知道这项修正案是否有帮助。它应该比使用
TOP
更有效:注意 - 未经测试 - 我无法在这里访问 SQL 2008。
What happens if you remove
TOP (1) WITH TIES
from the inner query, and set the outer query to return the top k rows?I'd also be interested to know whether this amendment helps at all. It ought to be more efficient than using
TOP
:NB - untested - I don't have access to SQL 2008 here.
引自 Microsoft® SQL Server® 2008 内部:T-SQL 编程。第 14.8.4 节。
清单 2-1。创建和填充辅助数字表
Quoted from Inside Microsoft® SQL Server® 2008: T-SQL Programming. Section 14.8.4.
Listing 2-1. Creating and Populating Auxiliary Table of Numbers