I'm building an application that pulls lat/long values from a database and plots them on a Google Map. There could be thousands of data points so I "cluster" points close to each other so the user is not overwhelmed with icons. At the moment I perform this clustering in the application, with a simple algorithm like this:
- Get array of all points
- Pop first point off array
- Compare first point to all other points in array looking for ones that fall within x distance
- Create a cluster with the original and close points.
- Remove close points from array
- Repeat
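For reference, the steps above can be sketched roughly as follows. This is only an illustration of the naive approach, not the actual application code: the function names, the (lat, lon) tuple representation, and the use of haversine distance in kilometres for "x distance" are all assumptions.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def greedy_cluster(points, max_km):
    """Pop a seed point, group everything within max_km of it, repeat."""
    remaining = list(points)
    clusters = []
    while remaining:
        seed = remaining.pop(0)
        close, far = [], []
        # Compare the seed against every other remaining point.
        for p in remaining:
            (close if haversine_km(seed, p) <= max_km else far).append(p)
        clusters.append([seed] + close)
        remaining = far  # close points are removed from the array
    return clusters
```

Each pass scans every remaining point, so the worst case (no two points within range of each other) is quadratic in the number of points.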
Now I realize this is inefficient, which is why I have been looking into GIS systems. I have set up PostGIS and have my lat/long values stored in a POINT geometry object.
Can someone get me started or point me to some resources on a simple implementation of this clustering algorithm in PostGIS?
I ended up using a combination of ST_SnapToGrid and avg. I realize there are algorithms out there (e.g. k-means, as Denis suggested) that will give me better clusters, but for what I'm doing this is fast and accurate enough.
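To show the idea outside the database, here is a minimal Python sketch of grid snapping followed by averaging. It is not the query I actually ran: the function name, the (lat, lon) tuples, and the cell size in degrees are assumptions for illustration only.

```python
from collections import defaultdict

def grid_cluster(points, cell_size):
    """Snap each (lat, lon) to the nearest grid node, then average each group.

    Returns a list of (avg_lat, avg_lon, count) tuples, one per occupied cell.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        # Index of the nearest grid node along each axis.
        key = (round(lat / cell_size), round(lon / cell_size))
        cells[key].append((lat, lon))

    clusters = []
    for members in cells.values():
        n = len(members)
        avg_lat = sum(p[0] for p in members) / n
        avg_lon = sum(p[1] for p in members) / n
        clusters.append((avg_lat, avg_lon, n))
    return clusters
```

This is a single pass over the points, which is why it is fast. The trade-off is that two nearby points on opposite sides of a cell boundary end up in different clusters.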
If it's enough to have things clustered in the browser, you could easily make use of OpenLayers' clustering capabilities. There are 3 examples that show clustering.
I've used it with a PostGIS database before, and as long as you don't have ridiculous amounts of data, it works pretty smoothly.
An example of clustering lonlat points (of st_point type) with PostGIS. The result set will contain (cluster_id, id) pairs. The number of clusters is the argument passed to ST_ClusterKMeans.
We need the Common Table Expression with a COUNT window function to make sure the number of clusters passed to ST_ClusterKMeans never exceeds the number of input rows.
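A sketch of what such a query could look like, wrapped in Python so it can be run through a DB-API cursor. The table name places, the column names id and lonlat, the ::geometry cast, and the psycopg2-style %(k)s placeholder are all assumptions; adjust them to your schema and driver.

```python
# Hypothetical schema: places(id, lonlat). The cast to geometry is an
# assumption in case the column is stored as geography.
KMEANS_SQL = """
WITH counted AS (
    SELECT id, lonlat, COUNT(*) OVER () AS row_count
    FROM places
)
SELECT ST_ClusterKMeans(
           lonlat::geometry,
           LEAST(row_count::integer, %(k)s)  -- cap k at the number of rows
       ) OVER () AS cluster_id,
       id
FROM counted;
"""

def fetch_clusters(cursor, k):
    """Run the query on a DB-API cursor and return (cluster_id, id) rows."""
    cursor.execute(KMEANS_SQL, {"k": k})
    return cursor.fetchall()
```

With a psycopg2 connection this would be called as fetch_clusters(conn.cursor(), 10). LEAST keeps the requested cluster count from exceeding the row count when the table is small.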