How to correctly write data to an HDF5 file
I am getting TypeError: Object dtype dtype('O') has no native HDF5 equivalent.
Here is my Python code. The dtype of mel_train, mfcc_train, and y_train is float32 in each case.
The array shapes are mfcc_train: (6398,), mel_train: (6398,), and y_train: (6398, 16).

    import h5py

    # train_file is the path of the HDF5 file to create
    with h5py.File(train_file, 'w') as f:
        f['mfcc_train'] = mfcc_train
        f['mel_train'] = mel_train
        f['y_train'] = y_train
OK, I think I know what's going on (educated guess). You extract the audio data to arrays mel and mfcc, then add them to lists mel_train and mfcc_train (looping over 6398 audio files). After you exit the loop, you convert the lists to arrays. If every mel and mfcc array has the same shape, say (m,n), the new arrays would have shape (6398,m,n), where 6398 is len(mel_train). However, I suspect each mel and mfcc array has a different shape. As a result, when you convert the list of differently shaped arrays to a single array, you get an array of shape (6398,) with dtype=object (where the objects are float32 arrays).
To demonstrate the difference, I created 2 nearly identical examples:
The first creates arrays of shape (10,2), adds them to a list, then converts the list to an array. Note how the final array has shape (5,10,2) and dtype float64. You can create an HDF5 dataset directly from this array.
The second does the same with arrays of different shapes. The final array has shape (5,) and dtype object. You cannot create an HDF5 dataset directly from this array. This is why you get TypeError: Object dtype dtype('O') has no native HDF5 equivalent.
Note: I added dtype=object to the np.asarray() function for the second method to avoid the VisibleDeprecationWarning.
Example 2 shows 2 methods to load data. It continues from Example 1 and loads data into the same HDF5 file. After you run them, you can compare dataset mel_train1, group mel_train2 and dataset mel_train3. Each has a "Note" attribute to describe the data.
Code below:
Example 1 - constant shape arrays:
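A minimal sketch of this case, not the answer's original listing. The file location, the random data and the wording of the "Note" attribute are assumptions made for illustration; the dataset name mel_train1 and the shapes come from the description above.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location; a temp directory keeps the sketch self-contained.
h5_path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Five arrays that all share the shape (10, 2).
mel_list = [np.random.random((10, 2)) for _ in range(5)]

# Equal shapes stack into one regular numeric array.
mel_train = np.asarray(mel_list)
print(mel_train.shape, mel_train.dtype)  # (5, 10, 2) float64

# A regular float array maps directly onto an HDF5 dataset.
with h5py.File(h5_path, "w") as h5f:
    ds = h5f.create_dataset("mel_train1", data=mel_train)
    ds.attrs["Note"] = "Constant shape arrays stored as one dataset"
```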
Example 2 - variable shape arrays:
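A minimal sketch of the variable-shape case, again an illustration rather than the original listing. The row counts (10 to 14) are made up; only the resulting shape (5,) and dtype object are taken from the description above. On NumPy 1.24 and later, omitting dtype=object here raises a ValueError instead of a deprecation warning.

```python
import numpy as np

# Five arrays whose first dimension differs: (10, 2), (11, 2), ... (14, 2).
# The row counts are hypothetical stand-ins for audio clips of unequal length.
mel_list = [np.random.random((10 + i, 2)) for i in range(5)]

# Unequal shapes cannot be stacked, so NumPy builds a 1-D array of objects.
mel_train = np.asarray(mel_list, dtype=object)
print(mel_train.shape, mel_train.dtype)  # (5,) object

# Each element is still an ordinary float64 array with its own shape.
element_shapes = [arr.shape for arr in mel_train]
print(element_shapes)
```

Checking the dtype of the outer array and the shapes of its elements like this is a quick way to confirm whether a (6398,) array is really a container of differently sized arrays.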
Loading Example 2 data as-is will throw an exception
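A short sketch that reproduces the failure under the same assumptions (hypothetical file location and made-up row counts). It catches the TypeError so the script keeps running.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location for the demonstration.
h5_path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Object array holding five differently shaped float arrays.
mel_train = np.asarray(
    [np.random.random((10 + i, 2)) for i in range(5)], dtype=object
)

error_message = None
with h5py.File(h5_path, "w") as h5f:
    try:
        # h5py has no HDF5 type for a generic Python object array.
        h5f.create_dataset("mel_train", data=mel_train)
    except TypeError as err:
        error_message = str(err)

print(error_message)
```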
Recommended method to load Example 2 data
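One way this could look, as a sketch: each array is written as its own dataset inside the group mel_train2, so every array keeps its true shape. The group name comes from the description above; the dataset names (mel_0000, ...), the file location and the note text are assumptions. The sketch writes to its own temporary file to stay self-contained.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location for the demonstration.
h5_path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Differently shaped arrays, kept as a plain Python list.
mel_list = [np.random.random((10 + i, 2)) for i in range(5)]

with h5py.File(h5_path, "w") as h5f:
    grp = h5f.create_group("mel_train2")
    grp.attrs["Note"] = "Variable shape arrays, one dataset per array"
    for i, arr in enumerate(mel_list):
        # Zero-padded names keep the datasets in insertion order when listed.
        grp.create_dataset(f"mel_{i:04d}", data=arr)

# Read back: iterating a group yields the dataset names.
with h5py.File(h5_path, "r") as h5f:
    stored_shapes = [h5f["mel_train2"][name].shape for name in h5f["mel_train2"]]

print(stored_shapes)
```

No padding or bookkeeping of true lengths is needed with this layout, which is presumably why it is the preferred option.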
Alternate method to load Example 2 data (not recommended)
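The description above only says that mel_train3 is a single dataset. One plausible reading, and it is an assumption, is a fixed-size dataset as large as the biggest array, with shorter arrays leaving zeros behind. The sketch below shows that padding approach with hypothetical data and file location.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location for the demonstration.
h5_path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Differently shaped arrays: (10, 2) up to (14, 2).
mel_list = [np.random.random((10 + i, 2)) for i in range(5)]
max_rows = max(arr.shape[0] for arr in mel_list)

with h5py.File(h5_path, "w") as h5f:
    # One regular dataset sized for the longest array (assumed layout).
    ds = h5f.create_dataset(
        "mel_train3", shape=(len(mel_list), max_rows, 2), dtype="float64"
    )
    ds.attrs["Note"] = "Variable shape arrays, zero-padded to a common shape"
    for i, arr in enumerate(mel_list):
        # Rows past arr.shape[0] are never written and keep the default fill of 0.
        ds[i, : arr.shape[0], :] = arr

with h5py.File(h5_path, "r") as h5f:
    padded = h5f["mel_train3"][:]

print(padded.shape)  # (5, 14, 2)
```

The cost of this layout is wasted space and the need to track each array's true length separately, since padding zeros are indistinguishable from real zero values.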
Typical output from running code above: