Reshaping a 3D array? HDF5 dataset type?
I have data in the following shape: (127260, 2, 1250)
The type of this data is <HDF5 dataset "data": shape (127260, 2, 1250), type "<f8">
The first dimension (127260) is the number of signals, the second dimension (2) is the type of signal, and the third dimension (1250) is the number of points in each signal.
What I want to do is reduce the number of points in each signal by cutting it in half, leaving 625 points per signal, and so end up with twice as many signals.
How do I convert an HDF5 dataset to something like a NumPy array, and how do I do this reshape?
If I understand correctly, you want a new dataset with shape (2*127260, 2, 625). If so, it's fairly simple to read 2 slices of the dataset into 2 NumPy arrays, create a new array from the slices, then write it to a new dataset. Note: reading slices is simple and fast. I would leave the data as-is and do this on the fly unless you have a compelling reason to create a new dataset. In the code for this, h5f is the h5py file object. Alternatively, you can combine 2 of those steps.
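The two in-memory approaches described above might look roughly like the sketch below. It is an illustration, not the original code: the function names, the dataset names ("data", "split_a", "split_b") and the small in-memory demo file are assumptions made for the example, and it assumes the number of points is even. With the real dataset, the new float64 array alone is about 2.5 GB.

```python
import h5py
import numpy as np


def split_via_arrays(h5f, src_name, dst_name):
    """Read both halves into NumPy arrays, stack them, write a new dataset."""
    src = h5f[src_name]
    half = src.shape[2] // 2          # assumes an even number of points
    first = src[:, :, :half]          # slicing a dataset returns a NumPy array
    second = src[:, :, half:]
    new_arr = np.concatenate((first, second), axis=0)
    return h5f.create_dataset(dst_name, data=new_arr)


def split_preallocated(h5f, src_name, dst_name):
    """Same result, with the slices assigned straight into one preallocated array."""
    src = h5f[src_name]
    n_signals, n_types, n_points = src.shape
    half = n_points // 2
    new_arr = np.empty((2 * n_signals, n_types, half), dtype=src.dtype)
    new_arr[:n_signals] = src[:, :, :half]
    new_arr[n_signals:] = src[:, :, half:]
    return h5f.create_dataset(dst_name, data=new_arr)


# Hypothetical small stand-in for the (127260, 2, 1250) dataset.
demo = np.arange(4 * 2 * 10, dtype="<f8").reshape(4, 2, 10)

# driver="core" with backing_store=False keeps the demo file in memory only.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as h5f:
    h5f.create_dataset("data", data=demo)
    whole = h5f["data"][()]           # reads the entire dataset as a NumPy array
    out_a = split_via_arrays(h5f, "data", "split_a")[()]
    out_b = split_preallocated(h5f, "data", "split_b")[()]

print(out_a.shape)
```

In both versions the first half of every signal fills the first block of rows and the second half fills the second block, so row i and row i + n_signals come from the same original signal.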
There is also a 3rd method. It is the most direct way, and it reduces the memory overhead, which is important when you have very large datasets that won't fit in memory.
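One way such a lower-memory approach could be written is to create the destination dataset empty and assign each half into it, so the full doubled array is never built in memory. This is a sketch under assumptions: the function name and the dataset names are hypothetical, and the number of points is assumed to be even. Each assignment still reads one half of the source into memory at a time.

```python
import h5py
import numpy as np


def split_into_dataset(h5f, src_name, dst_name):
    """Create an empty destination dataset and fill it one half at a time."""
    src = h5f[src_name]
    n_signals, n_types, n_points = src.shape
    half = n_points // 2              # assumes an even number of points
    dst = h5f.create_dataset(
        dst_name, shape=(2 * n_signals, n_types, half), dtype=src.dtype
    )
    dst[:n_signals] = src[:, :, :half]    # first half of every signal
    dst[n_signals:] = src[:, :, half:]    # second half of every signal
    return dst


# Hypothetical small stand-in for the real dataset, held in memory only.
demo = np.arange(4 * 2 * 10, dtype="<f8").reshape(4, 2, 10)
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as h5f:
    h5f.create_dataset("data", data=demo)
    dst = split_into_dataset(h5f, "data", "data_split")
    dst_shape = dst.shape
    dst_values = dst[()]

print(dst_shape)
```

If even one half is too large to hold at once, the same assignments can be done in a loop over smaller blocks of signals.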
Whichever method you choose, I suggest adding an attribute to the new dataset that notes the data source, for future reference.
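Attributes are set through the dataset's attrs mapping in h5py. A minimal sketch follows; the attribute name "Data Source", its text, and the dataset name are placeholders chosen for the example, not values from the original answer.

```python
import h5py

# In-memory demo file; with a real file, open it in "r+" or "a" mode instead.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as h5f:
    ds = h5f.create_dataset("data_split", shape=(8, 2, 5), dtype="<f8")
    # Record where the data came from so the file documents itself.
    ds.attrs["Data Source"] = "halves of dataset 'data', stacked along axis 0"
    source_note = ds.attrs["Data Source"]
    attr_names = list(ds.attrs.keys())

print(source_note)
```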