Mathematica running out of memory

Posted 2024-08-09 09:30:02

I'm trying to run the following program, which calculates the roots of polynomials of degree up to d with coefficients only +1 or -1, and then stores them in files.

d = 20; n = 18000; 
f[z_, i_] := Sum[(2 Mod[Floor[(i - 1)/2^k], 2] - 1) z^(d - k), {k, 0, d}];

Here f[z,i] gives a polynomial in z with plus or minus signs counting in binary. Say d = 2; we would have

f[z,1] = -z^2 - z - 1
f[z,2] = -z^2 - z + 1
f[z,3] = -z^2 + z - 1
f[z,4] = -z^2 + z + 1
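
For reference, here is a quick sanity check of the sign pattern (a minimal sketch, assuming the definitions at the top; Block temporarily overrides the global d = 20):

(* List every polynomial f produces for d = 2 without disturbing the global d.
   Note that the code attaches bit k of i - 1 to z^(d - k) rather than z^k, so
   each result is the coefficient-reversed counterpart of the hand-listed one. *)
Block[{d = 2},
 Table[Expand[f[z, i]], {i, 1, 2^d}]
]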

DistributeDefinitions[d, n, f]

ParallelDo[
 Do[
  root = N[Root[f[z, i], j]];                          (* j-th root of the i-th sign polynomial *)
  {a, b} = Round[n ({Re[root], Im[root]}/1.5 + 1)/2],  (* map the root onto the n x n grid *)
  {i, 1, 2^d}],
 {j, 1, d}]

I realise reading this probably isn't too enjoyable, but it's relatively short anyway. I would have tried to cut it down to the relevant parts, but here I really have no clue what the trouble is. I'm calculating all roots of f[z,i], rounding them so that each corresponds to a point in an n by n grid, and saving that data in various files.
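
To illustrate the rounding step, here is the same formula applied to a few sample points (a small sketch; the square from -1.5 - 1.5 I to 1.5 + 1.5 I is scaled onto the n by n grid):

(* Map a few sample complex numbers onto the grid; -1.5 - 1.5 I lands on {0, 0},
   the origin on {n/2, n/2}, and 1.5 + 1.5 I on {n, n}. *)
With[{n = 18000},
 Round[n ({Re[#], Im[#]}/1.5 + 1)/2] & /@ {-1.5 - 1.5 I, 0. + 0. I, 1.5 + 1.5 I}
]
(* {{0, 0}, {9000, 9000}, {18000, 18000}} *)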

For some reason, the memory usage in Mathematica creeps up until it fills all the memory (6 GB on this machine); then the computation continues extremely slowly; why is this?

I am not sure what is using up the memory here - my only guess was that the file streams were using up the memory, but that's not the case: I tried appending data to 2 GB files and there was no noticeable memory usage for that. There seems to be absolutely no reason for Mathematica to be using large amounts of memory here.
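
For completeness, the file-writing code is not shown above; a minimal sketch of the kind of appending meant here (with a hypothetical file name) would be:

(* Append one grid point to a file and close it again; "roots_d20.dat" is a
   hypothetical name, and {a, b} is the pair computed in the loop above. *)
stream = OpenAppend["roots_d20.dat"];
Write[stream, {a, b}];
Close[stream];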

For small values of d (15, for example), the behaviour is the following: I have 4 kernels running. As they all run through the ParallelDo loop (each handling one value of j at a time), the memory use increases until they have each finished that loop once. On subsequent passes through the loop, the memory use does not increase at all. The calculation eventually finishes and everything is fine.

Also, quite importantly, once the calculation stops, the memory use does not go back down.
If I start another calculation, the following happens:

- If the previous calculation stopped while memory use was still increasing, it continues to increase (it might take a while to start increasing again, basically to get back to the same point in the computation).

- If the previous calculation stopped when memory use was not increasing, it does not increase further.

Edit: The issue seems to come from the relative complexity of f - changing it to some simpler polynomial seems to fix the issue. I thought the problem might be that Mathematica remembers f[z,i] for specific values of i, but unsetting it with f[z, i] =. just after calculating a root of f[z,i] complains that the assignment did not exist in the first place, and the memory is still used.
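
One way to confirm that f itself is not memoizing specific values of i (a small check, assuming the definitions at the top) is to look at its DownValues before and after computing a root:

(* f[z_, i_] := ... stores a single delayed rule and never adds memoized
   definitions like f[z, 5] = ..., so DownValues[f] stays at length 1. *)
Length[DownValues[f]]    (* 1 *)
N[Root[f[z, 5], 1]];
Length[DownValues[f]]    (* still 1 *)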

It's quite puzzling really, as f is the only remaining thing I can imagine taking up memory, but defining f in the inner Do loop and clearing it each time after a root is calculated does not solve the problem.

Comments (1)

世态炎凉 2024-08-16 09:30:02

Ouch, this is a nasty one.

What's going on is that N caches results in order to speed up future calculations if you need them again. Sometimes this is absolutely what you want, but sometimes it just breaks the world. Fortunately, you do have some options. One is to use the ClearSystemCache command, which does just what it says on the tin. After I ran your un-parallelized loop for a little while (before getting bored and aborting the calculation), MemoryInUse reported ~160 MiB in use. Using ClearSystemCache got that down to about 14 MiB.
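
For example, one way to apply that here (a sketch, not something I ran on the full d = 20 case) is to flush the cache every so often inside the inner loop; the call runs on whichever subkernel executes it, so each parallel kernel clears its own cache:

(* Same loop as in the question, with a periodic ClearSystemCache[] call;
   the interval of 10000 iterations is an arbitrary choice. *)
ParallelDo[
 Do[
  root = N[Root[f[z, i], j]];
  {a, b} = Round[n ({Re[root], Im[root]}/1.5 + 1)/2];
  If[Mod[i, 10000] == 0, ClearSystemCache[]],
  {i, 1, 2^d}],
 {j, 1, d}]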

One thing you should look at doing, instead of calling ClearSystemCache programmatically, is to use SetSystemOptions to change the caching behavior. You should take a look at SystemOptions["CacheOptions"] to see what the possibilities are.
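
A sketch of what that looks like (the exact suboption names vary between Mathematica versions, so check what SystemOptions actually reports before setting anything):

(* Show the current cache-related settings. *)
SystemOptions["CacheOptions"]

(* Hypothetical example of switching one cache off; only use a suboption name
   that actually appears in the output above for your version. *)
(* SetSystemOptions["CacheOptions" -> {"Numeric" -> {"Cache" -> False}}] *)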

EDIT: It's not terribly surprising that the caching causes a bigger problem for more complex expressions. It's got to be stashing copies of those expressions somewhere, and more complex expressions require more memory.
