How to restructure data read with h5py in Python

Posted on 2025-01-31 05:36:14

I am reading functions from an existing file using the h5py library.

readFile = h5py.File('File', 'r')

Using readFile.keys(), I obtained the list of the functions stored in 'File'. One of these functions is phi. To read the function phi, I did:

phi = numpy.array(readFile['phi'])[:,0,:,:]

The index [:,0,:,:] reflects how the data is stored: [blocks, z, y, x]. z = 0 because it is a 2D case. x is divided into 2 blocks, and y is divided into 2 blocks. Each x block is divided into nxb cells (x1, x2, ..., x20), and each y block is divided into nyb cells. (nxb and nyb can also be obtained directly from the file using h5py, as they are stored in it. The domain of the data is also stored in the file, under ['bounding box'].)

The grid is then built as follows:

nxb = numpy.array(readFile['integer scalars'])[0][1]
nyb = numpy.array(readFile['integer scalars'])[1][1]
nblocks = phi.shape[0]  # number of blocks (first axis of phi)
bbox = numpy.array(readFile['bounding box'])
# meshgrid returns arrays of shape (nyb, nxb), matching the [y, x] order of phi
X = numpy.zeros([nblocks, nyb, nxb])
Y = numpy.zeros([nblocks, nyb, nxb])
for block in range(nblocks):
    x_min, x_max = bbox[block,0,:]
    y_min, y_max = bbox[block,1,:]
    X[block,:,:], Y[block,:,:] = numpy.meshgrid(numpy.linspace(x_min,x_max,nxb),
                                                numpy.linspace(y_min,y_max,nyb))

My question is this: I am trying to restructure the data (see the figure). I want to move the data of block 2 above the data of block 1 rather than next to it. This means I need to create new coordinates i' and j' related to the old coordinates i and j. I tried this, but it does not work:

for i in range(X):
    for j in range(Y):
        i' = i -len(X[0:1,:,:]
        j' = j + len(Y[0:1,:,:]
        phi(i',j') = phi



    

Comments (1)

诗笺 2025-02-07 05:36:14

When working with HDF5 data, it's important to understand your data schema before you start writing code. Here are my initial observations and suggestions.

Your question is a little hard to follow. (For example, you are using the term "functions" to describe HDF5 datasets.) HDF5 organizes data in datasets and groups. Your data of interest is in 2 datasets: 'phi' and 'integer scalars'.
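One way to learn the schema before writing any processing code is to walk the file and list every group and dataset. The sketch below is only an illustration: the file name demo.h5, the 'extra' group, and the helper summarize() are hypothetical, and the demo file is created on the spot so that the example runs on its own. With a real file, only the call to summarize() would be needed.

```python
import os
import tempfile

import h5py
import numpy as np


def summarize(path):
    """Return one description line per group or dataset in the file."""
    lines = []

    def visit(name, obj):
        # visititems calls this once for every object below the root group
        if isinstance(obj, h5py.Dataset):
            lines.append(f"dataset {name} shape={obj.shape} dtype={obj.dtype}")
        elif isinstance(obj, h5py.Group):
            lines.append(f"group {name}")

    with h5py.File(path, "r") as h5f:
        h5f.visititems(visit)
    return lines


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical demo file; the real file layout may differ
    path = os.path.join(tmp, "demo.h5")
    with h5py.File(path, "w") as h5f:
        h5f.create_dataset("phi", data=np.zeros((4, 1, 3, 5)))
        h5f.create_group("extra").create_dataset("values", data=np.arange(3))

    summary = summarize(path)
    for line in summary:
        print(line)
```

The shape printed for each dataset tells you how many axes it has and how long each one is, which is usually enough to decide how to index it.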

You can simplify the code that accesses the datasets as NumPy arrays using the following:

with h5py.File('File','r') as readFile:
    # to get the axis dimensions for 'phi':
    print(f"Shape of Dataset phi: {readFile['phi'].shape}")
    phi_ds = readFile['phi']  # to get a dataset object
    phi_arr = readFile['phi'][()]  # to read dataset as a numpy array

    # to get the axis dimensions for 'integer scalars'
    nxb, nyb = readFile['integer scalars'].shape

I don't understand what you mean by "blocks". Are you referring to the axis dimensions? Also, why are you using meshgrid? If you simply want to change dimensions, use NumPy's .reshape() method to change the axis dimensions of the NumPy array.

Here is a simple example that creates a 2x2 dataset, then reads it into a new array and reshapes it to 4x1. I think this is what you want to do. Change the values of a0 and a1 if you want to increase the size. The reshape operation reads the shape from the first array and reshapes the new array to (N,1), where N is your nxb*nyb value.

import h5py
import numpy as np

with h5py.File('SO_72340647.h5','w') as h5f:
    a0, a1 = 2,2
    arr = np.arange(a0*a1).reshape(a0,a1)
    h5f.create_dataset('ds_2x2',data=arr)

with h5py.File('SO_72340647.h5','r') as h5f:
    print(f"Shape of Dataset ds_2x2: {h5f['ds_2x2'].shape}")
    ds_arr = h5f['ds_2x2'][()]
    print(ds_arr)
    ds0, ds1 = ds_arr.shape
    new_arr = ds_arr.reshape(ds0*ds1,1)
    print(f"Shape of new (reshaped) array: {new_arr.shape}")
    print(new_arr)
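The reshape above only changes the axis dimensions. If "blocks" in the question instead means tiles of one 2D domain stored as [blocks, y, x], then placing one block above another (rather than next to it) may be closer to a concatenation than a reshape. The following is a minimal sketch under that assumption: the array phi is synthetic stand-in data, the sizes are made up, and "block 1" and "block 2" are assumed to be indices 0 and 1.

```python
import numpy as np

# Hypothetical stand-in for phi[:, 0, :, :]: 4 blocks, each of shape (nyb, nxb)
nblocks, nyb, nxb = 4, 3, 5
phi = np.arange(nblocks * nyb * nxb, dtype=float).reshape(nblocks, nyb, nxb)

# Blocks joined along the x axis (columns): block 2 sits next to block 1
side_by_side = np.concatenate([phi[0], phi[1]], axis=1)

# Blocks joined along the y axis (rows): block 1 fills the first nyb rows,
# block 2 fills the following nyb rows
stacked = np.concatenate([phi[0], phi[1]], axis=0)

print(side_by_side.shape)
print(stacked.shape)
```

Whether block 2 should come before or after block 1 along y depends on how the bounding boxes are ordered in the real file, so the order of the list passed to concatenate may need to be swapped.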

Note: h5py dataset objects "behave like" Numpy arrays. So, you frequently don't have to read into an array to use the data.
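As an illustration of that note, a dataset object can be sliced directly, and only the requested part is read into memory as a NumPy array. This is a sketch with a made-up file (demo_phi.h5) and made-up dimensions; it assumes a 'phi' dataset laid out as [blocks, z, y, x] as described in the question.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical demo file standing in for the real one
    path = os.path.join(tmp, "demo_phi.h5")
    data = np.arange(4 * 1 * 3 * 5, dtype=float).reshape(4, 1, 3, 5)
    with h5py.File(path, "w") as h5f:
        h5f.create_dataset("phi", data=data)

    with h5py.File(path, "r") as h5f:
        phi_ds = h5f["phi"]               # dataset object, nothing read yet
        ds_shape = phi_ds.shape           # shape is available without reading
        phi_2d = phi_ds[:, 0, :, :]       # reads the z=0 plane of every block
        first_block = phi_ds[0, 0, :, :]  # reads a single block only
    # phi_ds is no longer usable here because the file is closed,
    # but phi_2d and first_block are ordinary NumPy arrays

print(ds_shape, phi_2d.shape, first_block.shape)
```

Slicing the dataset this way avoids wrapping it in numpy.array() first, which reads the whole dataset before indexing.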
