Clustering points in PostGIS

Posted on 2024-11-16 17:35:25

I'm building an application that pulls lat/long values from a database and plots them on a Google Map. There could be thousands of data points so I "cluster" points close to each other so the user is not overwhelmed with icons. At the moment I perform this clustering in the application, with a simple algorithm like this:

  1. Get array of all points
  2. Pop first point off array
  3. Compare first point to all other points in array looking for ones that fall within x distance
  4. Create a cluster with the original and close points.
  5. Remove close points from array
  6. Repeat
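The steps above can be sketched in Python roughly as follows. This is only an illustration of the application-side approach described in the question, not the poster's actual code: the function names, the (lat, lon) tuple layout, and the use of haversine distance in metres are all assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; an assumed constant


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def greedy_cluster(points, max_dist_m):
    """Cluster points by repeatedly taking a seed and absorbing its neighbours."""
    remaining = list(points)          # step 1: array of all points
    clusters = []
    while remaining:                  # step 6: repeat until nothing is left
        seed = remaining.pop(0)       # step 2: pop the first point
        near, far = [], []
        for p in remaining:           # step 3: compare the seed to every other point
            (near if haversine_m(seed, p) <= max_dist_m else far).append(p)
        clusters.append([seed] + near)  # step 4: seed plus its close points
        remaining = far               # step 5: close points leave the array
    return clusters


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.0, 0.001), (10.0, 10.0), (10.0, 10.001), (50.0, 50.0)]
    for cluster in greedy_cluster(sample, 1000.0):
        print(cluster)
```

Each pass compares the seed against every remaining point, so the worst case is quadratic in the number of points, which is the inefficiency the question is about.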

I realize this is inefficient, which is why I have been looking into GIS systems. I have set up PostGIS and have my lat/long values stored in a POINT geometry object.

Can someone get me started or point me to some resources on a simple implementation of this clustering algorithm in PostGIS?

Comments (3)

撩动你心 2024-11-23 17:35:25

I ended up using a combination of snaptogrid and avg. I realize there are algorithms out there (e.g. kmeans, as Denis suggested) that would give me better clusters, but for what I'm doing this is fast and accurate enough.
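The poster's query is not shown, so the following is only a guess at the idea, written in Python rather than SQL: snap every point to the nearest node of a regular grid (which is what PostGIS's ST_SnapToGrid does), group the points that land on the same node, and average each group's coordinates to get one marker per cell. The function name, the (lon, lat) tuple layout, and the cell size in degrees are assumptions.

```python
from collections import defaultdict


def grid_cluster(points, cell_size):
    """Group (lon, lat) points by nearest grid node and average each group.

    Returns a list of (avg_lon, avg_lat, count) tuples, one per occupied cell.
    """
    cells = defaultdict(list)
    for lon, lat in points:
        # Index of the nearest grid node on each axis (the "snap" step).
        key = (round(lon / cell_size), round(lat / cell_size))
        cells[key].append((lon, lat))

    markers = []
    for members in cells.values():
        n = len(members)
        avg_lon = sum(p[0] for p in members) / n  # the "avg" step
        avg_lat = sum(p[1] for p in members) / n
        markers.append((avg_lon, avg_lat, n))
    return markers


if __name__ == "__main__":
    sample = [(0.1, 0.1), (0.2, 0.3), (5.0, 5.0)]
    for marker in grid_cluster(sample, 1.0):
        print(marker)
```

This touches each point once, so it scales linearly, at the cost of cluster boundaries that follow the grid rather than the data.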

哽咽笑 2024-11-23 17:35:25

If it's enough to have things clustered in the browser, you could easily make use of OpenLayers' clustering capabilities. There are 3 examples that show clustering.

I've used it with a PostGIS database before, and as long as you don't have ridiculous amounts of data, it works pretty smoothly.

寄离 2024-11-23 17:35:25

An example of clustering lonlat points (of st_point type) with PostGIS. The result set will contain (id, cid) pairs, where cid is the cluster ID. The number of clusters is the second argument passed to ST_ClusterKMeans.

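WITH sparse_places AS (
  -- Attach the total row count to every row so it can cap the cluster count.
  SELECT
    lonlat, id, COUNT(*) OVER () AS count
  FROM places
)
SELECT
  sparse_places.id,
  -- Ask for at most 10 clusters, but never more clusters than rows.
  ST_ClusterKMeans(lonlat::geometry, LEAST(count::integer, 10)) OVER () AS cid
FROM sparse_places;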

We need the Common Table Expression with a COUNT window function to make sure the number of clusters passed to ST_ClusterKMeans never exceeds the number of input rows.
