Memory errors and list limits?

Posted 2024-10-29 20:30:21

I need to produce very large matrices (Markov chains) for scientific purposes. I perform calculations whose results I put in a list of 20301 elements (= one row of my matrix). I need all of that data in memory to proceed to the next Markov step, but I can store it elsewhere (e.g. in a file) if needed, even if that slows down my Markov chain walk-through. My computer (scientific lab): dual Xeon, 6 cores/12 threads each, 12 GB memory, OS: win64

Traceback (most recent call last):
  File "my_file.py", line 247, in <module>
    ListTemp.append(calculus)
MemoryError

Example of a calculation result: 9.233747520008198e-102 (yes, it's over 1/9000)

The error is raised right after the element at index 19766 is stored:

ListTemp[19766]
1.4509421012263216e-103

If I go further

Traceback (most recent call last):
  File "<pyshell#21>", line 1, in <module>
    ListTemp[19767]
IndexError: list index out of range

So the list hit the memory error on loop iteration 19767.

Questions:

Is there a memory limit to a list? Is it a "by-list limit" or a "global-per-script limit"?
How to bypass those limits? Any possibilities in mind?
Will it help to use numpy, python64? What are the memory limits with them? What about other languages?

ら栖息 2024-11-05 20:30:21

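First, see "How big can a Python array get?" and "Numpy, problem with long arrays" (https://stackoverflow.com/questions/1697557/numpy-problem-with-long-arrays).

Second, the only real limit comes from the amount of memory you have and how your system stores memory references. There is no per-list limit, so Python will keep going until it runs out of memory. Two possibilities:

If you are running on an older OS or one that forces processes to use a limited amount of memory, you may need to increase the amount of memory the Python process has access to.
Break the list apart using chunking. For example, do the first 1000 elements of the list, pickle and save them to disk, and then do the next 1000. To work with them, unpickle one chunk at a time so that you don't run out of memory. This is essentially the same technique that databases use to work with more data than will fit in RAM.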
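
The chunking idea could be sketched roughly as follows. This is only an illustration: the helper names (`save_in_chunks`, `iter_chunks`), the file naming scheme and the placeholder data are assumptions, not part of the original answer.

```python
import os
import pickle
import tempfile

CHUNK_SIZE = 1000  # elements per chunk, as in the example above


def _dump(chunk, directory, index):
    """Pickle one chunk to a numbered file and return its path."""
    path = os.path.join(directory, "chunk_%05d.pkl" % index)
    with open(path, "wb") as handle:
        pickle.dump(chunk, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def save_in_chunks(values, directory, chunk_size=CHUNK_SIZE):
    """Write an iterable to disk as pickle files, one file per chunk."""
    paths = []
    chunk = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            paths.append(_dump(chunk, directory, len(paths)))
            chunk = []  # start a fresh list so the old one can be reclaimed
    if chunk:  # final, possibly shorter chunk
        paths.append(_dump(chunk, directory, len(paths)))
    return paths


def iter_chunks(paths):
    """Load one chunk at a time so only a single chunk is held in memory."""
    for path in paths:
        with open(path, "rb") as handle:
            yield pickle.load(handle)


# Usage with placeholder data standing in for the real calculation.
workdir = tempfile.mkdtemp()
chunk_paths = save_in_chunks((float(i) for i in range(2500)), workdir)
total = sum(sum(chunk) for chunk in iter_chunks(chunk_paths))
```

Feeding the values in through a generator matters here: building the full list first and chunking it afterwards would hit the same memory ceiling.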

非要怀念 2024-11-05 20:30:21

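The MemoryError exception you are seeing is a direct result of running out of available RAM. This could be due either to the 2 GB per-program limit that Windows imposes on 32-bit programs, or to a lack of available RAM on your computer. (The original answer linked to a previous question here.)

If you are running a 64-bit copy of Windows, you should be able to go beyond the 2 GB limit by using a 64-bit copy of Python.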
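
A 64-bit operating system can still run a 32-bit interpreter, so it may be worth checking which build is actually in use. A minimal check, using only the standard library (the variable names are illustrative):

```python
import struct
import sys

# Size of a C pointer in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one.
pointer_bits = struct.calcsize("P") * 8

# sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print("Python build: %d-bit" % pointer_bits)
```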

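The IndexError occurs because Python hit the MemoryError exception before the entire array had been computed, so the list is shorter than expected. Again, this is a memory issue.

To get around this, you could try a 64-bit copy of Python or, better still, find a way to write your results to a file. To that end, look at numpy's memory-mapped arrays.

You should be able to run your entire set of calculations into one of these arrays, since the actual data is written to disk and only a small portion of it is held in memory.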
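
As a rough sketch of the memory-mapped approach, the snippet below uses `numpy.memmap` with a deliberately small matrix. The size `n = 200`, the file name and the uniform placeholder rows are assumptions for illustration; the question's rows have 20301 elements and come from the real calculation.

```python
import os
import tempfile

import numpy as np

n = 200  # small stand-in; the question's rows have 20301 elements
path = os.path.join(tempfile.mkdtemp(), "matrix.dat")

# mode="w+" creates (or overwrites) the backing file on disk.
matrix = np.memmap(path, dtype=np.float64, mode="w+", shape=(n, n))

for i in range(n):
    # Placeholder row: a uniform distribution instead of the real calculation.
    matrix[i, :] = np.full(n, 1.0 / n)

matrix.flush()  # push pending changes to the file
del matrix      # release the mapping

# Reopen read-only; rows are read from disk only when they are accessed.
readonly = np.memmap(path, dtype=np.float64, mode="r", shape=(n, n))
row_sum = float(readonly[0].sum())
```

Writing one row at a time keeps the working set small, and the operating system decides which pages of the file stay in RAM.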

玩套路吗 2024-11-05 20:30:21

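Python imposes no memory limit of its own. However, you will get a MemoryError if you run out of RAM. You say the list has 20301 elements. For simple data types (e.g. int) that seems too small to cause a memory error, but if each element is itself an object that takes up a lot of memory, you may well be running out of memory.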
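
A back-of-the-envelope estimate makes the sizes concrete. The sketch below assumes the full matrix is square (20301 x 20301) and stored as 8-byte doubles, which the question does not state explicitly; the per-object sizes reported by `sys.getsizeof` depend on the Python build.

```python
import sys

n = 20301  # elements per row, from the question

# A Python float object is larger than the 8-byte double it wraps,
# and the list additionally stores one pointer per element.
float_size = sys.getsizeof(1.0)
list_of_floats = sys.getsizeof([0.0] * n) + n * float_size

# Assumption: a square n x n matrix of 8-byte doubles (e.g. NumPy float64).
dense_matrix_bytes = n * n * 8

print("one row as a list of floats: about %.2f MB" % (list_of_floats / 1e6))
print("full matrix as float64:      about %.2f GB" % (dense_matrix_bytes / 1e9))
```

Under that assumption a single row is well under a megabyte, while the whole matrix needs about 3.3 GB even in the most compact form, which is already beyond the 2 GB available to a 32-bit process.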

The IndexError, however, is probably because your ListTemp has only 19767 elements (indexed 0 to 19766) and you are trying to access the element past the last one.
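
The off-by-one can be reproduced with a stand-in list of the same length (the name `items` is illustrative):

```python
items = [0.0] * 19767  # same length as ListTemp after the failed run

last = items[19766]    # valid: indices run from 0 to len(items) - 1

try:
    items[19767]       # one past the end
    raised = False
except IndexError:
    raised = True
```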

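It is hard to say what you can do to avoid hitting the limit without knowing exactly what you are trying to do. Using numpy might help. It looks like you are storing a huge amount of data, and you may not need to store all of it at every stage. But it is impossible to say without knowing more.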

小瓶盖 2024-11-05 20:30:21

If you want to circumvent this problem you could also use the shelve module. You would then create files sized to what your machine can handle and load them into RAM only when necessary, basically writing to the hard disk and pulling the information back in pieces so that you can process it.

Create a binary file and check whether the information is already in it: if so, assign it to a local variable; otherwise, write whatever data you deem necessary.

import shelve

# Open (or create) the shelf file on disk.
Data = shelve.open('File01')
for i in range(0, 100):
    Matrix_Shelve = 'Matrix' + str(i)
    if Matrix_Shelve in Data:
        # Already stored: load it into a local variable.
        Matrix_local = Data[Matrix_Shelve]
    else:
        # Not stored yet: write a placeholder value for later.
        Data[Matrix_Shelve] = 'somethingforlater'
Data.close()

Hope it doesn't sound too archaic.
