3G coverage map: visualising latitude, longitude, and ping data

Posted 2025-01-04 01:10:04


Suppose I've been driving a set route with a 3G modem and GPS on my laptop, while my computer back at home records the ping delay. I've correlated ping with GPS lat/long, and now I'd like to visualise this data.

I've got about 80,000 points of data per day, and I'd like to display several months' worth. I'm especially interested in displaying areas where ping consistently times out (i.e. ping == 1000).

Scatter plot

My first attempt was with a scatter plot, with one point per data entry. I made the size of the point 5x larger if it was a timeout, so it was obvious where these areas were. I also dropped the alpha to 0.1, for a crude way to see overlaid points.

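import matplotlib.pyplot as plt
from matplotlib import cm

# Colour each point by its ping value
c = pings
# Size: timeouts (ping >= 1000) get a marker area 5x larger
s = [2 if ping < 1000 else 10 for ping in pings]
# Scatter plot; the low alpha gives a rough impression of overlapping points
plt.scatter(longs, lats, s=s, marker='o', c=c, cmap=cm.jet, edgecolors='none', alpha=0.1)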

[Figure: scatter plot]

The obvious problem with this is that it displays one marker per data point, which is a very poor way to display large amounts of data. If I've driven past the same area twice, the two passes are simply drawn on top of each other rather than combined.

Interpolate over an even grid

I then tried using numpy and scipy to interpolate over an even grid.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata

# Convert python lists to np arrays
x = np.array(longs, dtype=float)
y = np.array(lats, dtype=float)
z = np.array(pings, dtype=float)

# Make even grid (200 rows/cols)
xi = np.linspace(x.min(), x.max(), 200)
yi = np.linspace(y.min(), y.max(), 200)

# Interpolate data points to grid; grid points outside the convex hull get 0
zi = griddata((x, y), z, (xi[None,:], yi[:,None]), method='linear', fill_value=0)

# Plot contour map: black contour lines over filled contours
plt.contour(xi,yi,zi,15,linewidths=0.5,colors='k')
plt.contourf(xi,yi,zi,15,cmap=plt.cm.jet)

From this example

This looks interesting (lots of colours and shapes), but it extrapolates too far around areas I haven't explored. You can't see the routes I've travelled, just red/blue blotches.

If I've driven in a large curve, it'll interpolate for the area between (see below):

[Figure: interpolation problems]
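One possible way to limit this, sketched here under assumptions rather than taken from the question, is to mask every grid cell that contains no measurements before plotting. The helper name `mask_unvisited` is hypothetical, and the sketch assumes the interpolated grid `zi` was evaluated at the centres of the cells defined by `x_edges` and `y_edges`.

```python
import numpy as np

def mask_unvisited(zi, x, y, x_edges, y_edges):
    """Mask cells of an interpolated grid that contain no measurements.

    zi is expected to have shape (len(y_edges) - 1, len(x_edges) - 1),
    i.e. rows follow y and columns follow x.
    """
    counts, _, _ = np.histogram2d(x, y, bins=(x_edges, y_edges))
    # histogram2d returns shape (nx, ny); transpose to match zi's (ny, nx).
    return np.ma.masked_where(counts.T == 0, zi)

# Tiny synthetic example: three samples on a grid of 4 columns and 3 rows.
x = np.array([0.5, 1.5, 3.5])
y = np.array([0.5, 0.5, 2.5])
x_edges = np.linspace(0.0, 4.0, 5)   # 4 columns
y_edges = np.linspace(0.0, 3.0, 4)   # 3 rows
zi = np.full((3, 4), 100.0)          # stand-in for the griddata output

masked = mask_unvisited(zi, x, y, x_edges, y_edges)
print(masked.count())  # 3 cells remain unmasked
```

Matplotlib's contourf accepts masked arrays, so the masked grid could be passed in place of zi. With a fine grid most cells along the route would still be empty, so a coarser mask may be needed in practice.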

Interpolate over an uneven grid

I then tried using meshgrid (xi, yi = np.meshgrid(lats, longs)) instead of a fixed grid, but I get an error saying my array is too big.
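A rough calculation suggests why this fails. np.meshgrid builds a full two-dimensional array from every pair of inputs, so one day's worth of points already needs billions of elements. The sketch below assumes 80,000 float64 values per input, as in the question.

```python
import numpy as np

n_points = 80_000                            # one day of samples
bytes_per_value = np.dtype(float).itemsize   # 8 bytes for float64

# np.meshgrid(lats, longs) returns two arrays of shape (n_points, n_points).
elements_per_array = n_points * n_points
gib_per_array = elements_per_array * bytes_per_value / 2**30

print(elements_per_array)        # 6400000000
print(round(gib_per_array, 1))   # 47.7

# The same shape rule on a small input: 3 x-values and 4 y-values.
xi, yi = np.meshgrid(np.arange(3), np.arange(4))
print(xi.shape)                  # (4, 3)
```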

Is there an easy way I can create a grid from my points?

My requirements:

Handle large data sets (80,000 x 60 = ~5 million points)
Combine duplicate data at each point, either by averaging (I assume interpolation will do this) or by taking the minimum value.
Don't extrapolate too far from the data points

I'm happy with a scatter plot (top), but I need some way to average the data before I display it.

(Apologies for the dodgy mspaint drawings, I can't upload actual data)

Solution:

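# a, b, c, d, res_long, res_lat and ax are not defined in this snippet
# Get sum of ping values in each (longitude, latitude) bin
hsum, long_range, lat_range = np.histogram2d(longs, lats, bins=(res_long,res_lat), range=((a,b),(c,d)), weights=pings)
# Get count of samples in each bin
hcount, ignore1, ignore2 = np.histogram2d(longs, lats, bins=(res_long,res_lat), range=((a,b),(c,d)))
# Get average, only for bins that contain samples (avoids dividing 0 by 0)
x, y = np.nonzero(hcount)
average = hsum[x, y] / hcount[x, y]
# Make scatter plot, one marker per occupied bin, placed at the bin's lower edges
scatterplot = ax.scatter(long_range[x], lat_range[y], s=3, c=average, linewidths=0, cmap="jet", vmin=0, vmax=1000)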
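The requirements also mention taking a minimum per point instead of an average. The following is a minimal, self-contained sketch of both reductions on a tiny hand-made data set. The function name `bin_pings` and its parameters are hypothetical, and it assumes every sample lies inside the given extent.

```python
import numpy as np

def bin_pings(longs, lats, pings, bins, extent):
    """Average and minimum ping per occupied (longitude, latitude) bin.

    Assumes every sample lies inside `extent`, given as
    ((long_min, long_max), (lat_min, lat_max)); `bins` is (n_long, n_lat).
    """
    longs, lats, pings = (np.asarray(v, dtype=float) for v in (longs, lats, pings))
    hsum, long_edges, lat_edges = np.histogram2d(
        longs, lats, bins=bins, range=extent, weights=pings)
    hcount, _, _ = np.histogram2d(longs, lats, bins=bins, range=extent)

    # Bin index of each sample; the clip puts values lying on the upper
    # edge into the last bin, as histogram2d does.
    ix = np.clip(np.searchsorted(long_edges, longs, side="right") - 1, 0, bins[0] - 1)
    iy = np.clip(np.searchsorted(lat_edges, lats, side="right") - 1, 0, bins[1] - 1)
    hmin = np.full(hcount.shape, np.inf)
    np.minimum.at(hmin, (ix, iy), pings)   # unbuffered per-bin minimum

    bx, by = np.nonzero(hcount)            # skip empty bins (avoids 0/0)
    centres_long = (long_edges[:-1] + long_edges[1:]) / 2
    centres_lat = (lat_edges[:-1] + lat_edges[1:]) / 2
    mean = hsum[bx, by] / hcount[bx, by]
    return centres_long[bx], centres_lat[by], mean, hmin[bx, by]

cx, cy, mean, lowest = bin_pings(
    longs=[0.1, 0.2, 0.9, 0.9], lats=[0.1, 0.2, 0.9, 0.8],
    pings=[100, 1000, 1000, 1000], bins=(2, 2), extent=((0, 1), (0, 1)))
print(mean.tolist())    # [550.0, 1000.0]
print(lowest.tolist())  # [100.0, 1000.0]
```

A bin whose minimum is still 1000 never received a reply, which is one way to read "consistently times out". The returned centres and values could be passed to ax.scatter in the same way as in the solution above.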


阪姬 2025-01-11 01:10:04


To simplify your question, you have two sets of points: one for ping<1000 and one for ping>=1000.
Because there are so many points, plotting each one directly with scatter() is not practical. I created some sample data with:

longs = (np.random.rand(60, 1) + np.linspace(-np.pi, np.pi, 80000)).reshape(-1)
lats = np.sin(longs) + np.random.rand(len(longs)) * 0.1

bad_index = (longs>0) & (longs<1)
bad_longs = longs[bad_index]
bad_lats = lats[bad_index]

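Here (longs, lats) are the points with ping<1000 and (bad_longs, bad_lats) are the points with ping>=1000. In this sample data the bad points are simply taken as a subset of the first set.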

You can use numpy.histogram2d() to count the points:

ranges = [[np.min(lats), np.max(lats)], [np.min(longs), np.max(longs)]]
h, lat_range, long_range = np.histogram2d(lats, longs, bins=(400,400), range=ranges)
bad_h, lat_range2, long_range2 = np.histogram2d(bad_lats, bad_longs, bins=(400,400), range=ranges)

h and bad_h are the point counts in every little square area.
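Since the question is about areas that consistently time out, one option is to divide the two count grids to get a timeout fraction per cell. This is a sketch rather than part of the original answer. The name `timeout_fraction` is hypothetical, and it assumes h counts every sample in a cell, including the bad ones.

```python
import numpy as np

def timeout_fraction(h, bad_h):
    """Fraction of samples in each cell that timed out; NaN for empty cells."""
    h = np.asarray(h, dtype=float)
    bad_h = np.asarray(bad_h, dtype=float)
    fraction = np.full(h.shape, np.nan)
    # Divide only where the cell has samples; empty cells keep their NaN.
    np.divide(bad_h, h, out=fraction, where=h > 0)
    return fraction

# Counts for a 2 x 2 grid: all samples, and the timed-out subset.
h = np.array([[10, 0], [4, 8]])
bad_h = np.array([[5, 0], [4, 0]])
frac = timeout_fraction(h, bad_h)
print(frac.tolist())  # [[0.5, nan], [1.0, 0.0]]
```

A fraction near 1.0 marks a cell where almost every ping timed out, regardless of how often the cell was visited.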

Then you can choose many methods to visualize it. For example, you can plot it by scatter():

y, x = np.where(h)
count = h[y, x]
pl.scatter(long_range[x], lat_range[y], s=count/20, c=count, linewidths=0, cmap="Blues")

count = bad_h[y, x]
pl.scatter(long_range2[x], lat_range2[y], s=count/20, c=count, linewidths=0, cmap="Reds")

pl.show()

Here is the full code:

import numpy as np
import pylab as pl

# Sample data: 60 noisy passes along a sine-shaped route, 80,000 points each
longs = (np.random.rand(60, 1) + np.linspace(-np.pi, np.pi, 80000)).reshape(-1)
lats = np.sin(longs) + np.random.rand(len(longs)) * 0.1

# Treat the points with 0 < longitude < 1 as the "bad" (timed-out) ones
bad_index = (longs>0) & (longs<1)
bad_longs = longs[bad_index]
bad_lats = lats[bad_index]

# Count points per cell on a 300 x 300 grid, using the same ranges for both sets
ranges = [[np.min(lats), np.max(lats)], [np.min(longs), np.max(longs)]]
h, lat_range, long_range = np.histogram2d(lats, longs, bins=(300,300), range=ranges)
bad_h, lat_range2, long_range2 = np.histogram2d(bad_lats, bad_longs, bins=(300,300), range=ranges)

# Plot one marker per non-empty cell, sized and coloured by its count
y, x = np.where(h)
count = h[y, x]
pl.scatter(long_range[x], lat_range[y], s=count/20, c=count, linewidths=0, cmap="Blues")

# Overlay the bad-point counts for the same cells in red
count = bad_h[y, x]
pl.scatter(long_range2[x], lat_range2[y], s=count/20, c=count, linewidths=0, cmap="Reds")

pl.show()

The output figure is:

[Figure: output of the code above]
