A baffling Mathematica 7 DumpSave[] problem

Posted 2024-09-15 23:40:04

I have a very large array of floating point image data:

In[25]:= Dimensions[daylightImgd]
Out[25]= {18, 2002, 2989}

In[26]:= daylightImgd[[1, 1]][[1 ;; 10]]

Out[26]= {0.0122293, 0.0104803, 0.0103955, 0.0115533, 0.0118063, \
0.0120648, 0.0122957, 0.011398, 0.0117426, 0.0119997}

I can successfully save this whole image array to disk using DumpSave:

DumpSave["thisWorks.mx", daylightImgd]

Dumping this giant array (an 861 MB file) takes less than 10 seconds. If I downsample these images like so:

downsample[image_, f_] := Module[{w, h}, h = Dimensions[image][[1]];
  w = Dimensions[image][[2]];
  Table[image[[i, j]], {i, 1, h, f}, {j, 1, w, f}]]

In[26]:= daylightImgdDown = downsample[#, 4] & /@ daylightImgd;
In[27]:= Dimensions[daylightImgdDown]
Out[27]= {18, 500, 748}

In[28]:= daylightImgdDown[[1, 1]][[1 ;; 10]]

Out[28]= {0.0122293, 0.0118063, 0.0117426, 0.0119349, 0.0109443, \
0.0121632, 0.0121304, 0.00681408, 0.0101728, 0.00603242}

Then suddenly I can't DumpSave anymore: the call hangs and maxes out the CPU forever, or at least for many minutes, until I kill it:

In[31]:= DumpSave["broken.mx", daylightImgdDown];    (* Hangs forever *)

So far as I can determine, everything is as it should be: the downsampled images have the right dimensions; you can plot them via ArrayPlot and everything looks great; manually listing the first row looks fine. Everything, in short, appears the same as with the non-down-sampled images, yet on the much smaller dataset DumpSave hangs.

Help?

UPDATE: Comments on Michael's answer

Wow. Thank you for the extremely thorough answer, which not only answered my question but taught me a bunch of peripheral stuff, too.

For your reference, the issue of packed-ness was a little trickier than just replacing my downsample[] with one of yours. The array I was trying to dump is an array of 18 images (a 3D array, in other words), and I apply the downsampling via the Map operator, so the resulting 3D array is not packed (according to PackedArrayQ) with either of your downsample rewrites.

However, if I take the output of those applications and then pack the resulting 3D array, the whole 3D array is packed, and only then can I DumpSave it. Weirdly, though, this final packing step, while necessary for a successful DumpSave, barely seems to alter the size as reported by ByteCount. Maybe this is easier in code:

In[42]:= downsample3[image_, f_] := 
 Module[{w, h}, h = Dimensions[image][[1]];
  w = Dimensions[image][[2]];
  Developer`ToPackedArray@Table[image[[i, j]], {i, 1, h, f}, {j, 1, w, f}]]

In[43]:= daylightImgdDown = downsample3[#, downsampleSize] & /@ daylightImgd;
In[44]:= ByteCount[daylightImgdDown]
Out[44]= 53966192

In[45]:= Developer`PackedArrayQ[daylightImgdDown]
Out[45]= False

In[46]:= dd = Developer`ToPackedArray[daylightImgdDown];
In[47]:= Developer`PackedArrayQ[dd]
Out[47]= True

In[48]:= ByteCount[dd]
Out[48]= 53963844

In[49]:= DumpSave["daylightImgdDown.mx", dd]; (* works now! *)
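The near-identical ByteCount values can be read as a hint that each of the 18 images was already packed on its own and only the outer list was not. The arithmetic below is a rough sketch of that reading. It assumes 8 bytes per machine real and 501 x 748 pixels per downsampled image, which is what the Table iterator yields for h = 2002 and w = 2989 if downsampleSize is 4; that value is not shown in the session, so treat it as an assumption.

```mathematica
(* Assumption: downsampleSize = 4, so each image is 501 x 748 machine reals *)
elements = 18*501*748;     (* 6745464 values in the whole stack *)
rawBytes = 8*elements      (* 53963712 bytes of raw floating-point data *)

(* Overhead of the single fully packed array, from Out[48] *)
53963844 - rawBytes        (* 132 *)

(* Extra cost of the list of 18 separately stored images, Out[44] minus Out[48] *)
53966192 - 53963844        (* 2348 *)
```

Under these assumptions nearly all of the memory is raw pixel data in both cases, and the 2348-byte difference is on the order of a small per-image header times 18, which would explain why packing the outer list changes ByteCount so little.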

Again, thanks very much.

Comments (1)

慕巷 2024-09-22 23:40:04

Without the actual data, an educated guess is that the large array DumpSaves quickly because it is a so-called "packed array", that is, an array of machine-sized floating point numbers that has a very efficient representation in Mathematica. Your downsample function (due to the use of Table) does not return a packed array, and an unpacked array is much larger in memory, potentially larger than even the original array despite the 4X downsampling. ByteCount might be illustrative there.

You can check for packed-array-ness with PackedArrayQ and attempt to pack an unpacked array with ToPackedArray, both found in the Developer context.
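A minimal sketch of those checks, using a small random matrix as stand-in data rather than the original images (the names packed, unpacked and repacked are purely illustrative):

```mathematica
(* Stand-in data: RandomReal returns a packed array of machine reals *)
packed = RandomReal[1, {200, 300}];

(* Deliberately unpack it to get the general list-of-lists representation *)
unpacked = Developer`FromPackedArray[packed];

(* Check the packed status of both forms *)
Developer`PackedArrayQ /@ {packed, unpacked}   (* {True, False} *)

(* Compare memory use; the unpacked form should report noticeably more bytes *)
ByteCount /@ {packed, unpacked}

(* Pack it again and confirm *)
repacked = Developer`ToPackedArray[unpacked];
Developer`PackedArrayQ[repacked]               (* True *)
```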

There are two solutions, if my guess is correct. One is to use ToPackedArray as shown:

downsample[image_, f_] := Module[{w, h}, h = Dimensions[image][[1]];
  w = Dimensions[image][[2]];
  Developer`ToPackedArray@Table[image[[i, j]], {i, 1, h, f}, {j, 1, w, f}]]

Even better is to simply replace your use of Table with Take, which should return a packed array in this case, and as an added bonus be a lot faster than using Table.

downsample[image_, f_] := Take[image, {1,-1,f}, {1,-1,f}]
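The update to the question reports that mapping either version over the 18 images still leaves the outer list unpacked. One possible way around that, sketched here on hypothetical stand-in data, is to slice the whole 3D stack in a single Part call instead of mapping over the images. downsampleStack is an illustrative name, not something from the question.

```mathematica
(* Sketch: keep every f-th row and column of every image in one Part call *)
downsampleStack[stack_, f_] := stack[[All, 1 ;; -1 ;; f, 1 ;; -1 ;; f]]

(* Stand-in for the real data: 3 small "images" of 20 x 30 machine reals *)
stack = RandomReal[1, {3, 20, 30}];

small = downsampleStack[stack, 4];
Dimensions[small]                 (* {3, 5, 8} *)

(* Part applied to a packed array should give a packed result,
   so no separate ToPackedArray step should be needed *)
Developer`PackedArrayQ[small]
```

If the check returns True for the real data as well, the result can be passed to DumpSave directly.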

You might also be interested in all the new image processing functionality in Mathematica 7.

Hope that helps!
