Resampling irregularly spaced data to a regular grid in Python

Posted on 2024-09-26 08:13:04

I need to resample 2D-data to a regular grid.

This is what my code looks like:

import matplotlib.mlab as ml
import numpy as np

y = np.zeros((512,115))
x = np.zeros((512,115))

# Just random data for this test:
data = np.random.randn(512,115)

# filling the grid coordinates:    
for i in range(512):
    y[i,:]=np.arange(380,380+4*115,4)

for i in range(115):
    x[:,i] = np.linspace(-8,8,512)
    y[:,i] -=  np.linspace(-0.1,0.2,512)

# Defining the regular grid
y_i = np.arange(380,380+4*115,4)
x_i = np.linspace(-8,8,512)

resampled_data = ml.griddata(x,y,data,x_i,y_i)

(512,115) is the shape of the 2D data, and I already installed mpl_toolkits.natgrid.

My issue is that I get back a masked array, where most of the entries are nan, instead of an array that is mostly composed of regular entries and just nan at the borders.

Could someone point me to what I am doing wrong?

Thanks!

Comments (1)

-残月青衣踏尘吟 2024-10-03 08:13:04

Comparing your code example to your question's title, I think you're a bit confused...

In your example code, you're creating regularly gridded random data and then resampling it onto another regular grid. You don't have irregular data anywhere in your example...

(Also, the code doesn't run as-is, and you should look into meshgrid rather than looping through to generate your x & y grids.)
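As a rough sketch of the meshgrid suggestion, the two loops from the question could be replaced with something like the following. The names x_1d and y_1d are mine, and I'm assuming the (512, 115) layout from the question, with x varying along the first axis and y along the second.

```python
import numpy as np

# Shape of the 2D data in the question: 512 samples in x, 115 samples in y
nx, ny = 512, 115

x_1d = np.linspace(-8, 8, nx)            # varies along axis 0
y_1d = np.arange(380, 380 + 4 * ny, 4)   # varies along axis 1

# indexing="ij" gives both outputs the shape (nx, ny) = (512, 115)
x, y = np.meshgrid(x_1d, y_1d, indexing="ij")

# The same row-dependent offset that the question's second loop
# subtracts from every column of y
y = y - np.linspace(-0.1, 0.2, nx)[:, np.newaxis]

print(x.shape, y.shape)
```

This should produce the same x and y arrays as the loops, just without iterating in Python.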

If you want to resample an already regularly sampled grid, as you do in your example, there are more efficient methods than griddata or anything I'm about to describe below. (scipy.ndimage.map_coordinates would be well suited to your problem in that case.)
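For illustration only, here is a minimal sketch of how map_coordinates might be used for the regular-to-regular case. The target grid (x_dst, y_dst) is made up for this example, and I'm assuming evenly spaced source axes so that coordinates can be converted to fractional array indices with a simple division.

```python
import numpy as np
from scipy import ndimage

# Source data on a regular grid (axis 0 follows x, axis 1 follows y)
x_src = np.linspace(-8, 8, 512)
y_src = np.arange(380, 380 + 4 * 115, 4)
data = np.random.randn(512, 115)

# Hypothetical target grid covering the same extent at a different resolution
x_dst = np.linspace(-8, 8, 256)
y_dst = np.linspace(380, 836, 200)

# map_coordinates works in index space, so convert the target
# coordinates into fractional row/column indices of the source array
row_idx = (x_dst - x_src[0]) / (x_src[1] - x_src[0])
col_idx = (y_dst - y_src[0]) / (y_src[1] - y_src[0])
rows, cols = np.meshgrid(row_idx, col_idx, indexing="ij")

# order=1 is (bi)linear interpolation; mode="nearest" clamps at the edges
resampled = ndimage.map_coordinates(data, [rows, cols], order=1, mode="nearest")
print(resampled.shape)
```

Because no triangulation is built, this tends to be much cheaper than griddata when the input really is a regular grid.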

Based on your question, however, it sounds like you have irregularly spaced data that you want to interpolate onto a regular grid.

In that case, you might have some points like this:

import numpy as np
import matplotlib.pyplot as plt
# matplotlib.mlab.griddata was removed in Matplotlib 3.1; use SciPy instead:
from scipy.interpolate import griddata

# Bounds and number of the randomly generated data points
ndata = 20
xmin, xmax = -8, 8
ymin, ymax = 380, 2428

# Generate random data
x = np.random.randint(xmin, xmax, ndata)
y = np.random.randint(ymin, ymax, ndata)
z = np.random.random(ndata)

# Plot the random data points
plt.scatter(x,y,c=z)
plt.axis([xmin, xmax, ymin, ymax])
plt.colorbar()
plt.show()

Randomly generated data

You can then interpolate the data as you were doing before... (Continued from code snippet above...)

# Size of regular grid
ny, nx = 512, 115

# Generate a regular grid to interpolate the data.
xi = np.linspace(xmin, xmax, nx)
yi = np.linspace(ymin, ymax, ny)
xi, yi = np.meshgrid(xi, yi)

# Interpolate using Delaunay triangulation
#zi = mlab.griddata(x,y,z,xi,yi) # 2023 use instead:
zi = griddata( (x,y), z, (xi,yi) )

# Plot the results
plt.figure()
plt.pcolormesh(xi,yi,zi)
plt.scatter(x,y,c=z)
plt.colorbar()
plt.axis([xmin, xmax, ymin, ymax])
plt.show()

Poorly interpolated data

However, you'll notice that you're getting lots of artifacts in the grid. This is due to the fact that your x coordinates range from -8 to 8, while y coordinates range from ~300 to ~2500. The interpolation algorithm is trying to make things isotropic, while you may want a highly anisotropic interpolation (so that it appears isotropic when the grid is plotted).

To correct for this, you need to create a new coordinate system to do your interpolation in. There is no one right way to do this. What I'm using below will work, but the "best" way depends heavily on what your data actually represents.

(In other words, use what you know about the system your data measures to decide how to do it. This is always true of interpolation! You should not interpolate unless you know what the result should look like and are familiar enough with the interpolation algorithm to use that a priori information to your advantage! There are also much more flexible interpolation algorithms than the Delaunay triangulation that griddata uses by default, but it's fine for a simple example...)
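As one small illustration of how much the choice of algorithm matters, scipy.interpolate.griddata itself accepts a method argument ("nearest", "linear", or "cubic"). The sketch below uses made-up points in the unit square, not the data from the examples, and only compares where each method leaves NaNs.

```python
import numpy as np
from scipy.interpolate import griddata

# Made-up scattered points in the unit square (not the data used elsewhere)
rng = np.random.default_rng(0)
ndata = 20
x = rng.uniform(0, 1, ndata)
y = rng.uniform(0, 1, ndata)
z = rng.random(ndata)

# Regular target grid; meshgrid output has shape (60, 50)
xi, yi = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 60))

# Run the same interpolation with each built-in method
results = {
    method: griddata((x, y), z, (xi, yi), method=method)
    for method in ("nearest", "linear", "cubic")
}

for method, zi in results.items():
    print(method, zi.shape, "NaN cells:", int(np.isnan(zi).sum()))
```

"linear" and "cubic" only fill the convex hull of the input points and return NaN outside it, while "nearest" fills the whole grid. Which behavior is appropriate depends on what the data represents.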

At any rate, one way to do this is to rescale the x and y coordinates so that they cover roughly the same range. In this case, we'll rescale them to run from 0 to 1... (forgive the spaghetti code... I'm just intending this to be an example...)

# (Continued from examples above...)
# Normalize coordinate system
def normalize_x(data):
    # np.float was removed in NumPy 1.24; the builtin float is equivalent
    data = data.astype(float)
    return (data - xmin) / (xmax - xmin)

def normalize_y(data):
    data = data.astype(float)
    return (data - ymin) / (ymax - ymin)

x_new, xi_new = normalize_x(x), normalize_x(xi)
y_new, yi_new = normalize_y(y), normalize_y(yi)

# Interpolate using Delaunay triangulation
#zi = mlab.griddata(x_new, y_new, z, xi_new, yi_new) # 2023 use instead:
zi = griddata( (x_new, y_new), z, (xi_new, yi_new) )

# Plot the results
plt.figure()
plt.pcolormesh(xi,yi,zi)
plt.scatter(x,y,c=z)
plt.colorbar()
plt.axis([xmin, xmax, ymin, ymax])
plt.show()

Data interpolated in a normalized coordinate system
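If you are using scipy.interpolate.griddata anyway, its rescale parameter may save you the manual normalization. As far as I know, rescale=True rescales the input points to the unit cube before interpolating, based on the extent of the data points themselves rather than on xmin/xmax and ymin/ymax, so the result can differ slightly from the hand-rolled version. The sketch below regenerates its own random points (with a seeded generator, which is my choice) so that it runs on its own.

```python
import numpy as np
from scipy.interpolate import griddata

# Same bounds as in the examples; the seeded generator is an assumption
# made here so the sketch is reproducible
rng = np.random.default_rng(0)
ndata = 20
xmin, xmax = -8, 8
ymin, ymax = 380, 2428
x = rng.uniform(xmin, xmax, ndata)
y = rng.uniform(ymin, ymax, ndata)
z = rng.random(ndata)

# Regular target grid in the original (unnormalized) coordinates
ny, nx = 512, 115
xi, yi = np.meshgrid(np.linspace(xmin, xmax, nx), np.linspace(ymin, ymax, ny))

# rescale=True asks griddata to normalize the coordinates internally,
# so the anisotropic x/y ranges no longer distort the triangulation
zi_rescaled = griddata((x, y), z, (xi, yi), method="linear", rescale=True)

print(zi_rescaled.shape, "NaN cells:", int(np.isnan(zi_rescaled).sum()))
```

The output grid can be plotted against the original xi and yi exactly as in the snippets above.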

Hope that helps, at any rate... Sorry for the length of the answer!
