Deleting variables from a .mat file
Does anyone here know how to delete a variable from a MATLAB file? I know that you can add variables to an existing MATLAB file using the save -append method, but there's no documentation on how to delete variables from the file.

Before someone says "just save it": I'm saving intermediate processing steps to disk to alleviate memory problems, and in the end there will be almost 10 GB of intermediate data per analysis routine. Thanks!
Interestingly enough, you can use the -append option with SAVE to effectively erase data from a .mat file. The documentation for -append notes that if a variable already exists in the MAT-file, SAVE overwrites it with the value in the workspace.

In other words, if a variable in your .mat file is called A, you can save over that variable with a new copy of A (that you've set to []) using the -append option. There will still be a variable called A in the .mat file, but it will be empty and thus reduce the total file size.

In my example, saving A to a .mat file gave a file size of about 7.21 MB. After setting A to [] and saving it again with -append, the file size was around 169 bytes. The variable is still in there, but it is empty.
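A minimal sketch of that sequence, for anyone who wants to try it. The 1000-by-1000 random matrix and the scratch file name from tempname are assumptions, not taken from the answer, and the sizes printed will depend on the MATLAB version and MAT-file format in use.

```matlab
% Scratch file so nothing in the current folder is touched (assumed name).
fname = [tempname '.mat'];

% Save a reasonably large variable and record the file size.
A = rand(1000);
save(fname, 'A');
before = dir(fname);

% Overwrite A in the file with an empty array using -append.
A = [];
save(fname, 'A', '-append');
after = dir(fname);

fprintf('Before: %d bytes, after: %d bytes\n', before.bytes, after.bytes);

% The variable is still listed in the file, but with size 0x0.
info = whos('-file', fname);
fprintf('%s is still present with size %s\n', info(1).name, mat2str(info(1).size));
```

Note that the name A stays in the file, so code that checks for the variable's existence will still find it.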
10 GB of data? Updating multi-variable MAT files could get expensive due to MAT format overhead. Consider splitting the data up and saving each variable to a different MAT file, using directories for organization if necessary. Even if you had a convenient function to delete variables from a MAT file, it would be inefficient. The variables in a MAT file are laid out contiguously, so replacing one variable can require reading and writing much of the rest. If they're in separate files, you can just delete the whole file, which is fast.
To see this in action, save a small variable x and a large variable y to one MAT file, then update each of them in turn, stepping through the code in the debugger while using something like Process Explorer (on Windows) to monitor its I/O activity.

On my machine (with reads and writes counted cumulatively and time measured per operation), updating the small x variable was more expensive than updating the large y. Much of this I/O activity is "redundant" housekeeping work to keep the MAT file format organized, and will go away if each variable is in its own file.
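A sketch of such an experiment is below. The variable sizes and the scratch file name are assumptions, and tic/toc only captures elapsed time, so an external tool is still needed to see the read and write counts. No particular timings are implied.

```matlab
% Scratch MAT file holding one small and one large variable (sizes assumed).
fname = [tempname '.mat'];
x = 1;            % small variable
y = rand(1000);   % large variable, about 8 MB uncompressed

tic; save(fname, 'x', 'y');        tCreate  = toc;

% Update only the small variable in place.
x = 2;
tic; save(fname, 'x', '-append');  tUpdateX = toc;

% Update only the large variable in place.
y = rand(1000);
tic; save(fname, 'y', '-append');  tUpdateY = toc;

fprintf('create: %.3f s, update x: %.3f s, update y: %.3f s\n', ...
        tCreate, tUpdateX, tUpdateY);
```

Setting a breakpoint on each save line makes it easier to attribute the I/O counters in Process Explorer to a single operation.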
Also, try to keep these files on the local filesystem; it'll be a lot faster than network drives. If they need to go on a network drive, consider doing the save() and load() on local temp files (maybe chosen with tempname()) and then copying them to/from the network drive. Matlab's save and load tend to be much faster with local filesystems, enough so that local save/load plus a copy can be a substantial net win.
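The local-temp-file pattern could look roughly like this. The netDir folder is a hypothetical stand-in for a network share (here just a folder under tempdir so the sketch runs anywhere), and the variable is made up for illustration.

```matlab
% Hypothetical stand-in for a network location; replace with the real share.
netDir = fullfile(tempdir, 'pretend_network_share');
if ~exist(netDir, 'dir')
    mkdir(netDir);
end

% Saving: write to a local temp file first, then move it to the share.
y = rand(500);
localFile = [tempname '.mat'];
save(localFile, 'y');
destFile = fullfile(netDir, 'y.mat');
movefile(localFile, destFile);

% Loading: copy from the share to a local temp file, then load that copy.
localCopy = [tempname '.mat'];
copyfile(destFile, localCopy);
s = load(localCopy);
delete(localCopy);   % clean up the local copy once it has been read
```

Whether this is a net win depends on the network and file sizes, so it is worth timing both variants on real data.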
Here's a basic implementation that will let you save variables to separate files using the familiar save() and load() signatures. They're prefixed with "d" to indicate they're the directory-based versions. They use some tricks with evalin() and assignin(), so I thought it would be worth posting the full code.
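The dsave() listing is not reproduced here. As a rough illustration of the idea only, a one-file-per-variable save might be sketched like this. The function name dsave_sketch is hypothetical, and the <dir>/<name>.mat layout is inferred from the dload() fragment; this is not the original code.

```matlab
function dsave_sketch(file, varargin)
%DSAVE_SKETCH Save each named caller variable to <file>/<name>.mat.
%   Illustrative sketch only; not the original dsave() listing.
if ~exist(file, 'dir')
    mkdir(file);
end
varNames = varargin;
if isempty(varNames)
    % No names given: take every variable in the caller's workspace.
    varNames = evalin('caller', 'who');
end
for i = 1:numel(varNames)
    name = varNames{i};
    tmp = struct;
    tmp.(name) = evalin('caller', name);   % fetch the value by name
    % -struct writes the single field as a top-level variable called <name>.
    save(fullfile(file, [name '.mat']), '-struct', 'tmp');
end
end
```

Saved as dsave_sketch.m, it would be called like save(): for example, dsave_sketch('results_dir', 'x', 'y') writes results_dir/x.mat and results_dir/y.mat.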
Here's the load() equivalent.
```matlab
% ... (the beginning of dload() is missing; file and varNames are set above)
out = struct;
for i = 1:numel(varNames)
    % Each variable lives in its own file: <file>/<name>.mat
    name = varNames{i};
    tmp = load(fullfile(file, [name '.mat']));
    out.(name) = tmp.(name);
end
if nargout == 0
    % Called without an output: put the variables in the caller's workspace.
    for i = 1:numel(varNames)
        assignin('caller', varNames{i}, out.(varNames{i}));
    end
    clear out
end
```

dwhos() is the equivalent of whos('-file').

And ddelete() to delete the individual variables like you asked.
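The ddelete() listing is missing here. Under the one-file-per-variable layout, deleting a variable comes down to deleting its .mat file. A minimal sketch, assuming the <dir>/<name>.mat layout seen in the dload() fragment and a scratch directory from tempname (the data is made up):

```matlab
% Build a small variable directory to demonstrate on (hypothetical data).
storeDir = tempname;
mkdir(storeDir);
vars = struct('x', 1, 'y', rand(200));
names = fieldnames(vars);
for i = 1:numel(names)
    % -struct with a field name writes just that field as a variable.
    save(fullfile(storeDir, [names{i} '.mat']), '-struct', 'vars', names{i});
end

% "Delete" variable y by removing its file; other variables are untouched.
varFile = fullfile(storeDir, 'y.mat');
if exist(varFile, 'file')
    delete(varFile);
end

% List what is left, in the spirit of whos('-file').
d = dir(fullfile(storeDir, '*.mat'));
remaining = regexprep({d.name}, '\.mat$', '');
disp(remaining);
```

No other variable's file is read or rewritten, which is the point of the directory-based layout.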
The only way of doing this that I know is to use the MAT-file API function matDeleteVariable. It would, I guess, be quite easy to write a Fortran or C routine to do this, but it does seem like a lot of effort for something that ought to be much easier.

I suggest you load the variables you want to keep from the .mat file and save them to a new .mat file. If necessary, you can load and save (using '-append') in a loop.
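A minimal sketch of that loop, loading one variable at a time so that only one needs to fit in memory. The file names, the sample variables, and the dropNames list are all made up for illustration.

```matlab
% Set up a source file with three variables (hypothetical data).
src = [tempname '.mat'];
dst = [tempname '.mat'];
A = rand(100);
B = 'keep me';
C = 1:10;
save(src, 'A', 'B', 'C');

% Names to leave out of the new file.
dropNames = {'A'};

% Work out which variables to keep without loading their contents.
info = whos('-file', src);
keepNames = setdiff({info.name}, dropNames);

for i = 1:numel(keepNames)
    tmp = load(src, keepNames{i});   % struct with a single field
    if exist(dst, 'file')
        save(dst, '-struct', 'tmp', '-append');
    else
        save(dst, '-struct', 'tmp');   % first variable creates the file
    end
end
```

Once the new file has been checked, it can replace the original. This rewrites all the kept data, so for very large files it costs one full pass over everything that remains.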