Baffling Mathematica 7 DumpSave[] issue
I have a very large array of floating point image data:
In[25]:= Dimensions[daylightImgd]
Out[25]= {18, 2002, 2989}
In[26]:= daylightImgd[[1, 1]][[1 ;; 10]]
Out[26]= {0.0122293, 0.0104803, 0.0103955, 0.0115533, 0.0118063, \
0.0120648, 0.0122957, 0.011398, 0.0117426, 0.0119997}
I can save this whole image array to disk successfully using DumpSave:
DumpSave["thisWorks.mx", daylightImgd]
Dumping this giant array (an 861 MB file) takes less than 10 seconds. If I downsample these images, a la:
downsample[image_, f_] := Module[{w, h}, h = Dimensions[image][[1]];
w = Dimensions[image][[2]];
Table[image[[i, j]], {i, 1, h, f}, {j, 1, w, f}]]
In[26]:= daylightImgdDown = downsample[#, 4] & /@ daylightImgd;
In[27]:= Dimensions[daylightImgdDown]
Out[27]= {18, 500, 748}
In[28]:= daylightImgdDown[[1, 1]][[1 ;; 10]]
Out[28]= {0.0122293, 0.0118063, 0.0117426, 0.0119349, 0.0109443, \
0.0121632, 0.0121304, 0.00681408, 0.0101728, 0.00603242}
Then suddenly I can't DumpSave anymore; the call hangs and spins forever, or at least for many minutes until I kill it, and it maxes out the CPU:
In[31]:= DumpSave["broken.mx", daylightImgdDown]; (* Hangs forever *)
So far as I can determine, everything is as it should be: the downsampled images have the right dimensions; you can plot them via ArrayPlot and everything looks great; manually listing the first row looks fine. Everything, in short, appears the same as with the non-down-sampled images, yet on the much smaller dataset DumpSave hangs.
Help?
UPDATE: Comments on Michael's answer
Wow. Thank you for the extremely thorough answer, which not only answered my question but taught me a bunch of peripheral stuff, too.
For your reference, the issue of packed-ness was a little trickier than just replacing my downsample[] with one of yours. Since the array I was trying to dump is an array of 18 images - a 3d array, in other words - and since I'm applying the downsampling via the Map operator, the packed-ness of the 3d array is false (according to PackedArrayQ) using either of your downsample rewrites.
However, if I take the output of those applications, and then pack the resultant 3d array, then the whole 3d array is packed, and only then can I DumpSave it. Weirdly, though, this final process of packing, while necessary for a successful DumpSave, barely seems to alter the size, as reported by ByteCount. Maybe this is easier in code:
In[42]:= downsample3[image_, f_] :=
Module[{w, h}, h = Dimensions[image][[1]];
w = Dimensions[image][[2]];
Developer`ToPackedArray@Table[image[[i, j]], {i, 1, h, f}, {j, 1, w, f}]]
In[43]:= daylightImgdDown = downsample3[#, downsampleSize] & /@ daylightImgd;
In[44]:= ByteCount[daylightImgdDown]
Out[44]= 53966192
In[45]:= Developer`PackedArrayQ[daylightImgdDown]
Out[45]= False
In[46]:= dd = Developer`ToPackedArray[daylightImgdDown];
In[47]:= Developer`PackedArrayQ[dd]
Out[47]= True
In[48]:= ByteCount[dd]
Out[48]= 53963844
In[49]:= DumpSave["daylightImgdDown.mx", dd]; (* works now! *)
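One possible reading of the ByteCount numbers above, offered as an assumption rather than a confirmed explanation: mapping over the 18 images yields an ordinary list whose elements are each already packed, so packing the outer list only removes a little per-image overhead. The sketch below uses small stand-in random data (stack, parts, whole, and direct are illustrative names, not from the session) to inspect packedness at both levels.

```mathematica
(* Stand-in data: 18 small random "images" instead of the real ones. *)
stack = RandomReal[1, {18, 40, 60}];

(* Downsample each image separately via Map, as in the session above. *)
parts = Take[#, {1, -1, 4}, {1, -1, 4}] & /@ stack;

(* Packedness of each individual 2D image, then of the outer list. *)
Developer`PackedArrayQ /@ parts
Developer`PackedArrayQ[parts]

(* Pack the whole 3D array and compare the reported sizes. *)
whole = Developer`ToPackedArray[parts];
Developer`PackedArrayQ[whole]
{ByteCount[parts], ByteCount[whole]}

(* Alternative sketch: downsample all images in one Part call, no Map. *)
direct = stack[[All, 1 ;; -1 ;; 4, 1 ;; -1 ;; 4]];
Developer`PackedArrayQ[direct]
direct == whole
```

If the single Part call returns a packed array on your version, it would avoid the separate ToPackedArray step, but that is worth checking with PackedArrayQ rather than assuming.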
Again, thanks very much.
Comments (1)
Without the actual data, an educated guess is that the large array DumpSaves quickly because it is a so-called "packed array", that is, an array of machine-sized floating point numbers that has a very efficient representation in Mathematica. Your downsample function (due to the use of Table) does not return a packed array, and an unpacked array is much larger in memory, potentially larger than even the original array despite being downsampled 4X. ByteCount might be illustrative there.

You can check for packed-array-ness with PackedArrayQ and attempt to pack an unpacked array with ToPackedArray, both found in the Developer` context.

There are two solutions, if my guess is correct. One is to apply ToPackedArray to the output of Table.

Even better is to simply replace your use of Table with Take, which should return a packed array in this case, and as an added bonus be a lot faster than using Table.

You might also be interested in all the new image processing functionality in Mathematica 7.

Hope that helps!
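The two suggestions might look roughly like the sketch below. It is not the original code from this answer: downsampleTable, downsamplePacked, and downsampleTake are hypothetical names, and img is stand-in random data. The Take version assumes the step form {1, -1, f} is meant to pick the same rows and columns as the Table iterators.

```mathematica
(* Stand-in image; the real data is not available here. *)
img = RandomReal[1, {200, 300}];

(* The question's approach: build the result element by element. *)
downsampleTable[image_, f_] :=
  Table[image[[i, j]],
    {i, 1, Dimensions[image][[1]], f},
    {j, 1, Dimensions[image][[2]], f}]

(* Option 1: pack whatever Table returns. *)
downsamplePacked[image_, f_] :=
  Developer`ToPackedArray[downsampleTable[image, f]]

(* Option 2: let Take select every f-th row and column directly. *)
downsampleTake[image_, f_] := Take[image, {1, -1, f}, {1, -1, f}]

a = downsampleTable[img, 4];
b = downsamplePacked[img, 4];
c = downsampleTake[img, 4];

(* The three results should contain the same elements. *)
a == b == c

(* Inspect packedness and memory footprint of each result. *)
Developer`PackedArrayQ[a]
Developer`PackedArrayQ[b]
Developer`PackedArrayQ[c]
{ByteCount[a], ByteCount[b], ByteCount[c]}
```

Running the PackedArrayQ and ByteCount lines on the real data is the quickest way to confirm or rule out the packed-array guess before retrying DumpSave.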