Memory overhead of structs vs. classes

Posted 2025-01-04 00:12:54 · 874 words · 0 views · 0 comments

I'm writing an app that will create thousands of small objects and store them recursively in arrays. By "recursively" I mean that each instance of K has an array of K instances, each of which has its own array of K instances, and so on. This array plus one int field are the only fields, along with some methods. I found that memory usage grows very fast even for a small amount of data (about 1 MB), and when the data I'm processing is about 10 MB I get an OutOfMemoryException, not to mention when it's bigger (I have 4 GB of RAM) :). So what do you suggest I do? I figured that if I created a separate class V to process those objects, so that instances of K held only the array of K's plus one integer field, and made K a struct instead of a class, it should optimize things a bit: no garbage collection and so on. But it's a bit of a challenge, so I'd rather ask whether it's a good idea before I start a total rewrite :).

EDIT:
Ok, some abstract code

public void Add(string word) {
    int i;
    string shorterWord;

    if (word.Length > 0) {
        i = //something, it's really irrelevant

        if (t[i] == null) {
            t[i] = new MyClass();
        }

        shorterWord = word.Substring(1); 

        //end of word
        if(shorterWord.Length == 0) {
            t[i].WordEnd = END;
        }

        //saving the word letter by letter
        t[i].Add(shorterWord);
    }
}

Comments (5)

冷血 2025-01-11 00:12:54

When I looked into this more deeply, I started from the following assumptions (they may be inexact; I'm getting old for a programmer). A class instance costs extra memory because a reference is needed to address it, and storing that reference takes a pointer the size of an Int32 in a 32-bit build. Class instances are always allocated on the heap (I can't remember whether C++ has other possibilities; I would venture yes).

The short answer, found in this article: an object has a 12-byte basic footprint plus 4 possibly unused bytes, depending on your class (no doubt something to do with padding).

http://www.codeproject.com/Articles/231120/Reducing-memory-footprint-and-object-instance-size

Another issue you'll run into is that arrays also have an overhead. One possibility would be to manage your own offsets into one or more larger arrays, which in turn gets closer to something a more efficient language would be better suited for.

I'm not sure whether there are libraries that provide efficient storage for small objects. There probably are.

My take: use structs, manage your own offsets in a large array, and use proper packing instructions if that serves you (although I suspect this costs a few extra instructions at runtime each time you address unevenly packed data).

[StructLayout(LayoutKind.Sequential, Pack = 1)]
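
A minimal sketch of the "structs plus your own offsets in one large array" idea, under stated assumptions: words contain only lowercase 'a'..'z', and the names `Node`, `NodePool`, `FirstChild`, `Fanout` and `Contains` are made up for illustration (only `WordEnd` comes from the question). Children are addressed by an int offset into one shared `Node[]`, so there is a single heap array instead of one object plus one array per node.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical node layout: one int of payload plus the offset of this
// node's block of children in the shared array (0 means "no children yet").
// With two int fields there is no padding to remove, so Pack = 1 changes
// nothing here; it is shown only to match the attribute above.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Node
{
    public int WordEnd;     // mirrors the question's WordEnd field
    public int FirstChild;  // offset of the child block inside the pool
}

public sealed class NodePool
{
    private const int Fanout = 26;            // assumption: letters 'a'..'z'
    private Node[] nodes = new Node[Fanout];  // slots 0..25 are the root's children
    private int used = Fanout;

    public int Count => used;                 // number of node slots handed out

    // Reserves a contiguous block of Fanout slots and returns its offset.
    private int AllocateBlock()
    {
        if (used + Fanout > nodes.Length)
            Array.Resize(ref nodes, Math.Max(nodes.Length * 2, used + Fanout));
        int start = used;
        used += Fanout;
        return start;
    }

    public void Add(string word)
    {
        int block = 0; // offset of the root block
        for (int pos = 0; pos < word.Length; pos++)
        {
            int slot = block + (word[pos] - 'a');
            if (pos == word.Length - 1)
            {
                nodes[slot].WordEnd = 1;
                return;
            }
            if (nodes[slot].FirstChild == 0)
            {
                // Allocate first, then index: AllocateBlock may replace the array.
                int child = AllocateBlock();
                nodes[slot].FirstChild = child;
            }
            block = nodes[slot].FirstChild;
        }
    }

    public bool Contains(string word)
    {
        int block = 0;
        for (int pos = 0; pos < word.Length; pos++)
        {
            int slot = block + (word[pos] - 'a');
            if (pos == word.Length - 1) return nodes[slot].WordEnd != 0;
            block = nodes[slot].FirstChild;
            if (block == 0) return false;
        }
        return false; // empty word
    }
}
```

Usage would look like `var pool = new NodePool(); pool.Add("cat");` followed by `pool.Contains("cat")`. Offset 0 can double as the "no children" marker because the root block is never anyone's child.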

请远离我 2025-01-11 00:12:54

Your stack is blowing up.

Do it iteratively instead of recursively.

You're not blowing the system stack up, you're blowing the code stack up; 10K function calls will blow it out of the water.

You need proper tail recursion, which is just an iterative hack.
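
For illustration, here is one way the question's recursive `Add` could be turned into a loop. It is a sketch under assumptions, not the asker's actual class: the question does not show how `i` is computed or what `END` is, so `i = c - 'a'`, `Fanout = 26` and `END = 1` are placeholders, the child array is allocated lazily, and `Contains` is added only so the result can be checked.

```csharp
public sealed class MyClass
{
    private const int Fanout = 26; // assumption: one slot per letter 'a'..'z'
    private const int END = 1;     // assumption: the question does not define END

    private MyClass[] t;           // child array, created on first use
    public int WordEnd;

    // Walks down the tree one letter at a time instead of recursing on
    // word.Substring(1), so no call frames or substrings pile up.
    public void Add(string word)
    {
        if (word.Length == 0) return;

        MyClass node = this;
        foreach (char c in word)
        {
            int i = c - 'a'; // placeholder for the question's elided index
            if (node.t == null) node.t = new MyClass[Fanout];
            if (node.t[i] == null) node.t[i] = new MyClass();
            node = node.t[i];
        }
        node.WordEnd = END; // the node reached by the last letter ends a word
    }

    public bool Contains(string word)
    {
        if (word.Length == 0) return false;

        MyClass node = this;
        foreach (char c in word)
        {
            if (node.t == null) return false;
            node = node.t[c - 'a'];
            if (node == null) return false;
        }
        return node.WordEnd == END;
    }
}
```

Whether or not the stack is the actual culprit, the loop also avoids allocating a new substring for every letter of every word.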

枕梦 2025-01-11 00:12:54

Make sure you have enough memory in your system, say more than 100 MB; it really depends on your system. What you are looking at is linked lists and recursive objects. If you keep recursing, you will hit the memory limit and an OutOfMemoryException will be thrown. Make sure you keep track of the memory usage of any program. Nothing is unlimited, especially memory. If memory is limited, save the data to disk.

It looks like there is infinite recursion in your code, which is why the out-of-memory exception is thrown. Check the code. Recursive code should have a start and an end; otherwise it will go over 10 terabytes of memory at some point.

方圜几里 2025-01-11 00:12:54

You can use a better data structure, i.e. each letter can be a byte (a = 0, b = 1, ...). Each word fragment can also be indexed, especially substrings. You should get away with significantly less memory (though at a performance penalty).
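
A small sketch of the "each letter can be a byte" part, assuming the words contain only lowercase 'a'..'z'. The class name `LetterCodec` and its methods are hypothetical. A .NET `char` takes two bytes, so the encoded form needs half the space per letter.

```csharp
using System;

public static class LetterCodec
{
    // Maps 'a' -> 0, 'b' -> 1, ... 'z' -> 25.
    public static byte[] Encode(string word)
    {
        var codes = new byte[word.Length];
        for (int i = 0; i < word.Length; i++)
        {
            char c = word[i];
            if (c < 'a' || c > 'z')
                throw new ArgumentException("only 'a'..'z' are supported", nameof(word));
            codes[i] = (byte)(c - 'a');
        }
        return codes;
    }

    // Inverse of Encode: 0 -> 'a', 1 -> 'b', and so on.
    public static string Decode(byte[] codes)
    {
        var chars = new char[codes.Length];
        for (int i = 0; i < codes.Length; i++)
            chars[i] = (char)('a' + codes[i]);
        return new string(chars);
    }
}
```

The encoded value can also serve directly as the index into a node's child array.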

软的没边 2025-01-11 00:12:54

Just list your recursive algorithm and sanitize the variable names. If you are doing a BFS-style traversal and keeping all objects in memory, you will run out of memory. In that case, replace it with DFS.

Edit 1:

You can speed up the algorithm by estimating how many items you will generate and then allocating that much memory at once. As the algorithm progresses, fill up the allocated memory. This reduces fragmentation and the reallocate-and-copy operations that happen whenever an array fills up.
Nonetheless, once you are done operating on these generated words, you should delete them from your data structure so they can be garbage-collected and you don't run out of memory.
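
A sketch of the preallocation idea using `List<T>`'s capacity constructor. The class `Prealloc`, the method `Fill` and the use of `int` items are placeholders for whatever the real algorithm generates.

```csharp
using System.Collections.Generic;

public static class Prealloc
{
    // Reserves room for the estimated number of items up front, so the
    // list's backing array is not regrown and copied while it fills.
    public static List<int> Fill(int estimatedCount)
    {
        var items = new List<int>(estimatedCount);
        for (int i = 0; i < estimatedCount; i++)
            items.Add(i); // placeholder for the generated items
        return items;
    }
}
```

If the estimate turns out too low, the list still grows on demand; the estimate only saves the intermediate copies.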
