Other possible causes of std::bad_alloc being thrown
I am working on quite a large SIP telephony application, and occasionally, when we use the integrated web UI (written using tntnet) under heavy call load, the program will exit due to a std::bad_alloc being thrown. There are hundreds of threads in use (3 per active call), so the location of the code causing the exception is quite random, but it is always after using the GUI.
Now, I understand that std::bad_alloc can be thrown when out of memory, which is not the case in this situation. I also suspect it can be thrown when there is heap corruption, which I am still searching for in the code base.
But my question is, are there any other reasons that std::bad_alloc will be thrown other than out of memory or heap corruption? I am using GNU g++ on Linux.
2 Answers
Most likely, you really are out of memory. It would be an extremely rare heap corruption bug that consistently caused only bad_alloc to be thrown. That would be like scribbling with surgical precision.

It is possible there's simply a bug in the code that allocates a massive amount of memory. But you would expect the exception to be thrown, at least a large fraction of the time, in that very code. The fact that the exception is coming from a number of different places weighs against that.

Severe fragmentation can cause a problem, particularly for platforms with poor implementations of malloc. That's rare, but it does happen. One thing I'd do immediately: catch the exception and call a function that saves a copy of /proc/self/maps. That will give you a good idea of the peak memory usage of the process, and you can judge whether it's anywhere near any platform, policy, or hardware limits.
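A minimal sketch of that idea, assuming a Linux target; the helper name and destination path are my own illustrations, and under genuine memory exhaustion even opening the output stream can fail, so low-level I/O would be more robust in practice:

    #include <fstream>
    #include <new>

    // Hypothetical helper: save a copy of /proc/self/maps so the process's
    // address-space layout at the moment of the failed allocation is kept
    // for inspection. The destination path is illustrative.
    void dump_maps()
    {
        std::ifstream in("/proc/self/maps");
        std::ofstream out("/tmp/badalloc_maps.txt");
        out << in.rdbuf();
    }

    void handle_request()
    {
        try {
            // ... allocation-heavy work ...
        } catch (const std::bad_alloc&) {
            dump_maps();  // record the memory map, then let the error propagate
            throw;
        }
    }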
On Linux, the current address space limit can be used to artificially limit the amount of memory a process can use. You can set it manually with setrlimit(RLIMIT_AS, ...), for an entire shell in bashrc using ulimit -v, or for the entire system in /etc/security/limits.conf. There might even be a /proc/sys entry for this someplace; I'm not sure.

If the address space limit is reached, your process will throw std::bad_alloc when trying to allocate more memory. On a 64-bit system this can be a nice "safety" to make sure a bad application or library doesn't run away with the available memory and make the system go to swap or stop working altogether. Make sure the program doesn't set this itself someplace, and make sure the rest of the environment hasn't set it either. You can just insert some code in the middle of the program someplace that calls getrlimit(RLIMIT_AS, ...) to be sure it hasn't snuck in someplace.
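Something along these lines would do (a minimal sketch; the function name and output format are illustrative):

    #include <sys/resource.h>
    #include <cstdio>

    // Sketch of the suggested check: print the current address-space limit
    // so you can confirm nothing has quietly lowered RLIMIT_AS.
    void report_as_limit()
    {
        rlimit lim{};
        if (getrlimit(RLIMIT_AS, &lim) == 0) {
            if (lim.rlim_cur == RLIM_INFINITY)
                std::printf("RLIMIT_AS: unlimited\n");
            else
                std::printf("RLIMIT_AS: soft=%llu hard=%llu bytes\n",
                            (unsigned long long)lim.rlim_cur,
                            (unsigned long long)lim.rlim_max);
        }
    }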
A perhaps more common cause (other than actually running out of memory, of course) is an unsigned integer wraparound, where a uint32_t or uint64_t used to size an allocation was 0 and had 1 subtracted from it, leading to an enormous allocation request (in 64 bits, that would be many thousands of petabytes).
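A contrived sketch of how that wraparound throws a bad_alloc nowhere near the real bug:

    #include <cstdint>

    int main()
    {
        uint64_t count = 0;               // e.g. from an empty container or a parsing bug
        // count - 1 wraps around to 2^64 - 1, so the line below asks the
        // allocator for an absurd amount of memory and throws far away from
        // the arithmetic mistake (std::bad_array_new_length on a modern
        // compiler, which derives from std::bad_alloc).
        char* buf = new char[count - 1];
        delete[] buf;                     // never reached
    }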
In any case, the best way to track this down is with GDB. If your application doesn't use exceptions at all (and thus has no "catch" statements at all), you can enable core files (ulimit -c unlimited). The next time the program crashes, the OS will generate a core file, and loading it up in GDB will immediately give you a backtrace showing where the program crashed.

If you have a few (but not many) places where try is used and it's catching these bad allocs, then besides just commenting them out while you debug this problem, you can run the application in GDB and use the catch throw command to have GDB break every time an exception is thrown. For either of these to work, never compile with -fomit-frame-pointer, and always compile with -ggdb (even when using -O3).