Geographic data clustering for heatmaps

Posted on 2025-01-02 12:27:19

I have a list of tweets with their geolocations. They will be displayed as a heatmap image overlaid transparently on a Google Map. The trick is to find groups of locations that lie close to each other and display each group as a single heatmap circle/figure whose heat/color depends on the cluster size.

Is there a library that already groups map locations into clusters? Or should I decide on my own clustering parameters and build a custom algorithm?


Comments (6)

若言繁花未落 2025-01-09 12:27:19

I don't know whether there is a library ready to group map locations into clusters; maybe there is, maybe there isn't. Either way, I don't recommend building your own clustering algorithm, since plenty of libraries already implement this.

@recursive sent you a link to PHP code for k-means (one clustering algorithm). There is also a large Java library, Java-ML, that offers other techniques as well, including k-means, hierarchical clustering, k-means++ (for choosing the initial centroids), and more.

Finally, keep in mind that clustering is an unsupervised technique: it will give you a set of clusters with data in them, but at first glance you don't know how the algorithm grouped your data. It may cluster by location, as you want, but it may also cluster by some other feature you don't care about, so it comes down to playing with the algorithm's parameters and tuning your solution.

I'm interested in the final solution you find for this problem :) Maybe you can share it in a comment when you finish the project!

眉目亦如画i 2025-01-09 12:27:19

K-means clustering is a technique often used for such problems.

The basic idea is this:

Given an initial set of k means m1,…,mk, the algorithm proceeds by alternating between two steps:

  1. Assignment step: Assign each observation to the cluster with the closest mean.

  2. Update step: Recompute each mean as the centroid of the observations assigned to its cluster.

Here is some sample code for PHP.
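The linked PHP sample is not reproduced on this page. As a rough illustration of the two steps above, here is a minimal Python sketch. The function name `kmeans`, its parameters, and the sample coordinates are made up for this example, and it treats latitude/longitude as plain 2D coordinates, which is only a reasonable approximation over small areas.

```python
import math


def kmeans(points, initial_means, max_iter=100):
    """Plain k-means on 2-D points; returns (means, labels).

    Assumption: Euclidean distance on raw (lat, lon) pairs, which ignores
    the curvature of the Earth.
    """
    means = [tuple(m) for m in initial_means]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to the cluster with the closest mean.
        labels = [
            min(range(len(means)), key=lambda j: math.dist(p, means[j]))
            for p in points
        ]
        # Update step: each mean becomes the centroid of its assigned points.
        new_means = []
        for j, old in enumerate(means):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_means.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                new_means.append(old)  # an empty cluster keeps its old mean
        if new_means == means:  # nothing moved, so the result is stable
            break
        means = new_means
    return means, labels


# Hypothetical tweet locations: three near one city, two near another.
tweets = [(40.0, -74.0), (40.1, -74.1), (40.05, -73.95), (51.5, -0.1), (51.6, -0.2)]
centers, assignment = kmeans(tweets, initial_means=[tweets[0], tweets[3]])
print(centers, assignment)
```

The cluster sizes (how many points share each label) could then drive the heat/color of each circle. Note that k has to be chosen up front, and the result depends on the initial means.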

夜光 2025-01-09 12:27:19

heatmap.js is an HTML5 library for rendering heatmaps, and it has a sample that renders one on top of the Google Maps API. It's pretty robust, but it only works in browsers that support canvas:

The heatmap.js library is currently supported in Firefox 3.6+, Chrome 10, Safari 5, Opera 11 and IE 9+.

明媚殇 2025-01-09 12:27:19

You can try my PHP Hilbert curve class at phpclasses.org. It's a monster curve and reduces 2D complexity to 1D complexity. I use a quadkey to address a coordinate, and it has 21 zoom levels, like Google Maps.
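The PHP class itself is not shown here. To illustrate the general idea of reducing 2D positions to a 1D index, here is a small Python sketch of the standard iterative Hilbert curve mapping. It is not the author's implementation and does not use quadkeys; the names `hilbert_index` and `latlon_to_cell`, and the simple linear quantization of latitude/longitude, are assumptions made for this example.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its position
    along the Hilbert curve, in the range 0 .. n*n - 1."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def latlon_to_cell(lat, lon, n):
    """Quantize latitude/longitude linearly onto an n x n grid.

    Assumption: a plain equirectangular grid, not the Mercator tiling
    that Google Maps uses.
    """
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return x, y


# Sorting points by their curve index tends to keep nearby points together.
n = 1 << 10  # hypothetical grid resolution
places = [(40.71, -74.00), (51.51, -0.13), (40.73, -73.99)]
ordered = sorted(places, key=lambda p: hilbert_index(n, *latlon_to_cell(p[0], p[1], n)))
print(ordered)
```

Because consecutive positions on the curve are always adjacent grid cells, runs of nearby indices can serve as cheap candidate groups, although two neighboring cells are not guaranteed to have nearby indices.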

最近可好 2025-01-09 12:27:19

This isn't really a clustering problem. Heat maps don't work by creating clusters. Instead, they convolve the data with a Gaussian kernel. If you're not familiar with image processing, think of it as taking a normal (Gaussian) "stamp" and stamping it over each point. Since overlapping stamps add up on top of each other, areas of high density end up with higher values.
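The stamping idea can be sketched in a few lines of Python. This is a minimal illustration rather than a production renderer: the function name `gaussian_heat_grid`, the `sigma` default, and the cutoff at three standard deviations are assumptions, and the points are assumed to be already projected to integer pixel coordinates.

```python
import math


def gaussian_heat_grid(points, width, height, sigma=2.0, radius=None):
    """Add a Gaussian "stamp" around each (x, y) pixel and return the grid.

    grid[y][x] holds the accumulated intensity at that pixel.
    """
    if radius is None:
        radius = int(3 * sigma)  # assumption: ignore the stamp beyond ~3 sigma
    grid = [[0.0] * width for _ in range(height)]
    for px, py in points:
        # Only visit the pixels the stamp can reach, clipped to the image.
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                d2 = (x - px) ** 2 + (y - py) ** 2
                grid[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    return grid


# Two tweets at the same pixel and one isolated tweet (hypothetical data).
heat = gaussian_heat_grid([(5, 5), (5, 5), (15, 5)], width=20, height=10, sigma=1.0)
print(heat[5][5], heat[5][15])
```

The resulting values can be normalized and mapped to a color ramp; dense areas come out hotter without any explicit clustering step.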

帥小哥 2025-01-09 12:27:19

One simple alternative for heatmaps is to just round the lat/long to a fixed number of decimals and group by that.

See this explanation about lat/long decimal accuracy.

  • 1 decimal - about 11km
  • 2 decimals - about 1.1km
  • 3 decimals - about 110m

etc.

For a low-zoom-level heatmap with lots of data, rounding to 1 or 2 decimals and grouping the results by that should do the trick.
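A minimal Python sketch of this rounding approach, with a made-up function name `bin_by_rounding` and hypothetical coordinates:

```python
from collections import Counter


def bin_by_rounding(coords, decimals=2):
    """Count points per rounded (lat, lon) cell."""
    return Counter(
        (round(lat, decimals), round(lon, decimals)) for lat, lon in coords
    )


# Hypothetical tweet locations; the first two fall into the same 2-decimal cell.
tweets = [
    (40.7128, -74.0060),
    (40.7131, -74.0058),
    (40.7306, -73.9866),
    (51.5074, -0.1278),
]
for cell, count in bin_by_rounding(tweets, decimals=2).most_common():
    print(cell, count)
```

Each cell's count can then be used as the heat value for a single circle drawn at the cell's coordinates. The same grouping can be done in a database with a GROUP BY on the rounded columns.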
