Reading .h5 files and using them in a Dataset
I have two folders (one for training and one for testing), and each contains around 10 files in .h5 format. I want to read them and use them in a dataset. I have a function that reads a single file, but I don't know how to use it to read the files inside my class.
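
import h5py
from torch.utils.data import Dataset  # assumed: Dataset is PyTorch's base class

def read_h5(path):
    # Load the full 'image' and 'label' arrays from one file into memory.
    with h5py.File(path, 'r') as data:
        image = data['image'][:]
        label = data['label'][:]
    return image, label

class Myclass(Dataset):
    def __init__(self, split='train', transform=None):
        raise NotImplementedError

    def __len__(self):
        raise NotImplementedError

    def __getitem__(self, index):
        raise NotImplementedError
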
Do you have a suggestion?
Thank you in advance
Comments (1)
This might be a start for what you want to do. I implemented `__init__()`, but not `__len__()` or `__getitem__()`. The user provides the path, and the init function calls the class method `read_h5()` to get the arrays of image and label data. There is a short main that creates class objects from 2 different H5 files. Modify the `paths` list with the folder and file names for all of your training and testing data.

IMHO, creating a class that holds the array data is overkill (and could lead to memory problems if you have really large datasets). It is more memory efficient to create h5py dataset objects and access the data when you need it. The same files can be handled that way, without creating a class object that holds NumPy arrays.
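
To make the lazy-access idea concrete, here is a minimal sketch of a dataset class that implements `__len__()` and `__getitem__()` on top of h5py. It is an illustration, not the code from the answer above. It assumes that every file stores its samples along the first axis of the `image` and `label` datasets, and that the files live in `<root>/train` and `<root>/test`. The names `H5FolderDataset` and `root` are hypothetical.

```python
import bisect
import glob
import os

import h5py

try:
    from torch.utils.data import Dataset
except ImportError:  # lets the sketch run without PyTorch installed
    Dataset = object


class H5FolderDataset(Dataset):
    """Serves (image, label) pairs lazily from all .h5 files in <root>/<split>."""

    def __init__(self, root, split='train', transform=None):
        # Assumed layout: <root>/train/*.h5 and <root>/test/*.h5
        folder = os.path.join(root, split)
        self.paths = sorted(glob.glob(os.path.join(folder, '*.h5')))
        if not self.paths:
            raise FileNotFoundError(f'no .h5 files found in {folder}')
        self.transform = transform
        # Record only the sample counts here; the image data stays on disk.
        # offsets[i] is the global index of the first sample in file i.
        self.offsets = [0]
        for path in self.paths:
            with h5py.File(path, 'r') as f:
                self.offsets.append(self.offsets[-1] + f['image'].shape[0])
        self._files = {}  # file handles, opened on first access

    def __len__(self):
        return self.offsets[-1]

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Find the file that holds this global index, then the row inside it.
        file_idx = bisect.bisect_right(self.offsets, index) - 1
        row = index - self.offsets[file_idx]
        f = self._files.get(file_idx)
        if f is None:
            f = self._files[file_idx] = h5py.File(self.paths[file_idx], 'r')
        image = f['image'][row]  # reads one sample from disk
        label = f['label'][row]
        if self.transform is not None:
            image = self.transform(image)
        return image, label

    def close(self):
        # Release any file handles opened by __getitem__.
        for f in self._files.values():
            f.close()
        self._files.clear()
```

With PyTorch installed, an instance such as `H5FolderDataset('data', split='train')` can be passed to `torch.utils.data.DataLoader` like any other map-style dataset. If each file instead holds a single sample, the index arithmetic is unnecessary and `__getitem__()` could simply call `read_h5(self.paths[index])`.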