Points to polygons: how do I spatially match them using their given coordinates?

Posted 2025-02-01 10:22:05

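I have a dataset of georeferenced Flickr posts (about 35k, pictured below) and an unrelated dataset of georeferenced polygons (about 40k, pictured below); both are currently pandas DataFrames. The polygons do not cover the whole area in which posts can occur. I am having trouble understanding how to sort many different points into many different polygons (or check whether they are close to one). In the end I want a map of the polygons, colored by an attribute (tag) of the Flickr data points. I am trying to do this in Python. Do you have any ideas or recommendations?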

[Images: point dataframe; polygon dataframe]


小帐篷 2025-02-08 10:22:05

References

HoloViews: https://holoviews.org/
Datashader: https://datashader.org/
leafmap: https://leafmap.org/
geemap: https://geemap.org/

Since you have not provided any sample data to load and play with, my answer is descriptive in nature: it tries to explain some possible strategies for approaching the problem you are trying to solve.

I assume that:

these polygons are probably addresses (properties), and you essentially want to assign each geolocated Flickr post to its nearest best-match polygon.

First of all, you need to identify or acquire information on the precision of those Flickr geolocations: how far off could they be, given the numerous sources of error? (The reasons behind the errors are not your concern, but their magnitude is.) This will give you an idea of a circle of confusion (2D) or, more likely, a sphere of confusion (3D). Why 3D? Well, a Flickr post might come from a certain elevation in a high-rise apartment, so (x: longitude, y: latitude, z: altitude) may all need to be considered. But you have to study the data and any other information available to you to determine the best option here (a 2D or 3D space of confusion).
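To reason about a 2D circle of confusion in metres rather than degrees, a great-circle distance helper is handy. The sketch below uses the haversine formula; the function names and the spherical-Earth radius are my own assumptions, and altitude is ignored, so treat it as a starting point for the 2D case only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_confusion_radius(post, candidate, radius_m):
    """True if candidate (lat, lon) lies inside the post's circle of confusion."""
    return haversine_m(post[0], post[1], candidate[0], candidate[1]) <= radius_m
```

The radius itself has to come from what you learn about the geotag accuracy; no value is implied here.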

Once you have figured out the type of N-D space of confusion, you will need a distance metric (typically just the distance between two points) and an error radius expressed in that metric; call the radius sigma. Just to be on the safe side, find all the addresses (geopolygons) within a radius of 1 sigma and, additionally, within 2 sigma: these are your possible target addresses. For each of these addresses, compute the distances from the Flickr geolocation to its centroid and to the four corners of its rectangular bounding box.
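As a rough sketch of this step, the code below computes the five distances (centroid plus four bounding-box corners) and keeps the polygons that fall within 1 or 2 sigma. It assumes planar coordinates in metres (i.e. data already projected), polygons given as plain vertex lists with non-zero area, and it uses "nearest of the five reference points" as the within-sigma criterion; all of those are my assumptions, as are the function names.

```python
import math


def centroid(poly):
    """Area-weighted centroid of a simple polygon [(x, y), ...] via the shoelace formula."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


def bbox_corners(poly):
    """Four corners of the axis-aligned bounding box."""
    xs, ys = zip(*poly)
    return [(min(xs), min(ys)), (min(xs), max(ys)), (max(xs), min(ys)), (max(xs), max(ys))]


def reference_distances(point, poly):
    """Distances from point to the centroid and to the four bounding-box corners."""
    refs = [centroid(poly)] + bbox_corners(poly)
    return [math.dist(point, ref) for ref in refs]


def candidates(point, polygons, sigma):
    """Map polygon id -> (ring, distances), where ring is 1 (within 1 sigma) or 2 (within 2 sigma)."""
    found = {}
    for pid, poly in polygons.items():
        dists = reference_distances(point, poly)
        nearest = min(dists)
        if nearest <= 2 * sigma:
            found[pid] = (1 if nearest <= sigma else 2, dists)
    return found
```

With roughly 35k points and 40k polygons, a brute-force loop like this would be slow; a spatial index would be the usual remedy, but that is outside this sketch.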

You will then want to rank these addresses for each Flickr geolocation, based on the distances to all five points. You will also need a way of identifying a Flickr point that is far from a big building's center (its distance from the centroid could be much greater than its distance from the corners) but close to that building's edges, as opposed to a different property with a smaller footprint.
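One possible way to handle the big-building case is to measure the distance to the polygon's footprint itself (zero inside, otherwise the distance to the nearest edge) instead of relying only on the centroid and corners. This is a swapped-in technique of mine, not something the five-point ranking gives you directly; it again assumes planar coordinates and plain vertex lists, and the names are hypothetical.

```python
import math


def point_in_polygon(pt, poly):
    """Ray-casting test; points exactly on the boundary may be classified either way."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal ray through pt
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def segment_distance(pt, a, b):
    """Shortest distance from pt to the line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length2 = dx * dx + dy * dy
    if length2 == 0:
        return math.dist(pt, a)
    t = ((pt[0] - a[0]) * dx + (pt[1] - a[1]) * dy) / length2
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.dist(pt, (a[0] + t * dx, a[1] + t * dy))


def footprint_distance(pt, poly):
    """0 if pt is inside the polygon, else the distance to its nearest edge."""
    if point_in_polygon(pt, poly):
        return 0.0
    return min(segment_distance(pt, a, b) for a, b in zip(poly, poly[1:] + poly[:1]))
```

For example, with a 100 x 100 building at the origin and a 10 x 10 property starting at x = 110, a point at (102, 50) is closer to the small property's centroid but closer to the big building's edge, which is exactly the situation the centroid alone misjudges.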

For each Flickr point, you would thus have multiple predictions of which polygon it belongs to, each with a different probability (convert the distance-based scores into probabilities).
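There are many ways to turn distances into probabilities. The sketch below assumes a Gaussian kernel with scale sigma and normalises the weights so they sum to 1; the kernel choice and the function name are assumptions on my part, not the only reasonable option.

```python
import math


def distances_to_probabilities(distances, sigma):
    """distances: {polygon_id: distance}. Returns {polygon_id: probability}, summing to 1."""
    if not distances:
        return {}
    d_min = min(distances.values())
    # Shift exponents by the smallest one so the best candidate has weight 1;
    # this avoids every weight underflowing to zero when all distances are large.
    weights = {
        pid: math.exp(-0.5 * ((d / sigma) ** 2 - (d_min / sigma) ** 2))
        for pid, d in distances.items()
    }
    total = sum(weights.values())
    return {pid: w / total for pid, w in weights.items()}
```

Candidates at equal distance get equal probability, and a smaller sigma makes the distribution more sharply peaked on the nearest polygon.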

Thus, for any Flickr location you choose, you should be able to show the top-k geopolygons that location could belong to (with probabilities).
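Picking the top-k is then a small step. A minimal sketch, assuming the probabilities for one Flickr point are held in a dict keyed by a hypothetical polygon id:

```python
import heapq


def top_k(probabilities, k=3):
    """Return the k most probable (polygon_id, probability) pairs, best first."""
    return heapq.nlargest(k, probabilities.items(), key=lambda item: item[1])
```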

For visualizations, I would suggest using HoloViews with Datashader, as that combination should be able to handle the large number of points in your data. Also, please take a look at leafmap (or geemap).