Splitting an `h5` file and combining the parts back

I have an h5 file, which is basically the model weights output by Keras. For storage reasons, I'd like to split the large h5 file into smaller pieces and combine them back into a single file when needed. However, the way I do it seems to lose some "metadata" (I'm not sure; maybe it's losing a lot more, but judging by the sizes of the combined file and the original file, it doesn't seem like I'm missing much).

Here's my splitting script:

import os
import h5py

prefix = "model_weights"
fname_src = "DiffusiveSizeFactorAI/model_weights.h5"
size_max = 90 * 1024**2  # maximum size allowed in bytes
is_file_open = False
dest_fnames = []
idx = 0

with h5py.File(fname_src, "r") as src:
    for group in src:
        fname = f"{prefix}_{idx}.h5"
        if not is_file_open:
            dest = h5py.File(fname, "w")
            dest_fnames.append(fname)
            is_file_open = True
        group_id = dest.require_group(group)
        src.copy(f"/{group}", group_id)
        size = os.path.getsize(fname)
        if size > size_max:
            dest.close()
            idx += 1
            is_file_open = False
    dest.close()

and here's the script that I use for combining back the pieces:

fname_combined = f"{prefix}_combined.h5"

with h5py.File(fname_combined, "w") as combined:
    for fname in dest_fnames:
        with h5py.File(fname, "r") as src:
            for group in src:
                group_id = combined.require_group(group)
                src.copy(f"/{group}", group_id)

Just to add a little context in case it helps with debugging: when I load the "combined" model weights, here's the error I get:

ValueError: Layer count mismatch when loading weights from file. Model expected 108 layers, found 0 saved layers.

Note: the size of the original file and the combined one are about the same (they differ by less than 0.5%), which is why I think that I might be missing some metadata.
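
For completeness, the error above is raised when loading the combined file back the usual way; a minimal sketch, assuming `model` is the same Keras architecture the weights were originally saved from:

model.load_weights("model_weights_combined.h5")  # raises the layer-count mismatch above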

Comments (2)

宫墨修音 2025-01-31 02:56:42

I am wondering if there is an alternative solution to your problem. I am assuming you want to deploy the model on an embedded system, which leads to memory restrictions. If that is the case, here are some alternatives:

  1. Use TensorFlow Lite: its documentation claims that it significantly reduces the size of the model (I haven't really tested this). It also improves other important aspects of ML deployment on the edge. In summary, it can make the model up to 5x smaller (a conversion sketch follows this list).

  2. Apply pruning: pruning gradually zeroes out model weights during training to achieve model sparsity. Sparse models are easier to compress, and the zeros can then be skipped during inference for latency improvements.
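
For reference, a TensorFlow Lite conversion looks roughly like the following; this is a minimal sketch, assuming `model` is the Keras model already loaded in memory, and the output file name is arbitrary:

import tensorflow as tf

# Convert the in-memory Keras model to a TensorFlow Lite flatbuffer
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # default size/latency optimizations
tflite_model = converter.convert()

# Write the flatbuffer to disk
with open("model.tflite", "wb") as f:
    f.write(tflite_model)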

甜是你 2025-01-31 02:56:42

Based on an answer from h5py developers, there are two issues:

  1. Every time an h5 file is copied this way, an extra, duplicate group level is added to the destination file. Let me explain:

Suppose in src.h5, I have the following structure: /A/B/C. In these two lines:

group_id = dest.require_group(group)
src.copy(f"/{group}", group_id)

group is /A, and so, after copying, an extra /A will be added to dest.h5, which results in the following erroneous structure: /A/A/B/C. To fix that, one needs to explicitly pass name="A" as an argument to copy.

  2. Metadata at the root level "/" is not copied in either the splitting or the combining script. To fix that, given that the h5 data structure is very similar to a Python dict, you just need to add:
dest.attrs.update(src.attrs)

For personal use, I've written two helper functions: one that splits up a large h5 file into smaller parts, each not exceeding a specified size (passed as an argument by the user), and another that combines them back into a single h5 file. In case you find it useful, it can be found on GitHub here.
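
As an illustration (not the linked helpers themselves), here is a minimal sketch of the corrected splitting and combining scripts with both fixes applied; the file names and the 90 MB threshold are taken from the question:

import os
import h5py

prefix = "model_weights"
fname_src = "DiffusiveSizeFactorAI/model_weights.h5"
size_max = 90 * 1024**2  # maximum size allowed in bytes

dest_fnames = []
idx = 0
dest = None

# Splitting: copy top-level groups into numbered part files
with h5py.File(fname_src, "r") as src:
    for group in src:
        if dest is None:
            fname = f"{prefix}_{idx}.h5"
            dest = h5py.File(fname, "w")
            dest.attrs.update(src.attrs)  # fix 2: carry over root-level metadata
            dest_fnames.append(fname)
        # fix 1: copy into the destination root under the original name,
        # so no extra nested /group/group level is created
        src.copy(src[group], dest, name=group)
        dest.flush()  # make sure the size on disk is up to date
        if os.path.getsize(fname) > size_max:
            dest.close()
            dest = None
            idx += 1
    if dest is not None:
        dest.close()

# Combining: merge the parts back into a single file
fname_combined = f"{prefix}_combined.h5"
with h5py.File(fname_combined, "w") as combined:
    for i, fname in enumerate(dest_fnames):
        with h5py.File(fname, "r") as part:
            if i == 0:
                combined.attrs.update(part.attrs)  # fix 2: restore root-level metadata
            for group in part:
                part.copy(part[group], combined, name=group)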
