Why do I get a C malloc assertion failure?
I am implementing a divide and conquer polynomial algorithm so I can benchmark it against an OpenCL implementation, but I can't get malloc to work. When I run the program, it allocates a bunch of stuff, checks some things, then sends size/2 to the algorithm. Then when I hit the malloc line again it spits out this:
malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
Aborted
The line in question is:
int *mult(int size, int *a, int *b) {
    int *out, i, j, *tmp1, *tmp2, *tmp3, *tmpa1, *tmpa2, *tmpb1, *tmpb2, d, *res1, *res2;
    fprintf(stdout, "size: %d\n", size);
    out = (int *)malloc(sizeof(int) * size * 2);
}
I checked size with an fprintf, and it is a positive integer (usually 50 at that point). I tried calling malloc with a plain number as well and I still get the error. I'm just stumped at what's going on, and nothing I have found on Google so far is helpful.
Any ideas what's going on? I'm trying to figure out how to compile a newer GCC in case it's a compiler error, but I really doubt it.
Comments (8)
99.9% likely that you have corrupted memory (over- or under-flowed a buffer, wrote to a pointer after it was freed, called free twice on the same pointer, etc.)
Run your code under Valgrind to see where your program did something incorrect.
To give you a better understanding of why this happens, I'd like to expand upon @r-samuel-klatchko's answer a bit.
When you call malloc, what is really happening is a bit more complicated than just giving you a chunk of memory to play with. Under the hood, malloc also keeps some housekeeping information about the memory it has given you (most importantly, its size), so that when you call free, it knows things like how much memory to free. This information is commonly kept right before the memory location returned to you by malloc. More exhaustive information can be found on the internet™, but the (very) basic idea is that each block you get back is immediately preceded by a field recording its size.
Building on this (and simplifying things greatly), when you call malloc, it needs to get a pointer to the next part of memory that is available. One very simple way of doing this is to look at the previous bit of memory it gave away, and move size bytes further down (or up) in memory. With this implementation, after allocating p1, p2 and p3 your memory ends up as a sequence of blocks laid out one after another, each preceded by its own size field.
So, what is causing your error?
Well, imagine that your code erroneously writes past the amount of memory you've allocated (either because you allocated less than you needed, as was your problem, or because you're using the wrong boundary conditions somewhere in your code). Say your code writes so much data to p2 that it starts overwriting what is in p3's size field. When you next call malloc, it will look at the last memory location it returned, look at its size field, move to p3 + size and then start allocating memory from there. Since your code has overwritten size, however, this memory location is no longer right after the previously allocated memory.
Needless to say, this can wreak havoc! The implementors of malloc have therefore put in a number of "assertions", or checks, that try to do a bunch of sanity checking to catch this (and other issues) if they are about to happen. In your particular case, these assertions are violated, and thus malloc aborts, telling you that your code was about to do something it really shouldn't be doing.
As previously stated, this is a gross oversimplification, but it is sufficient to illustrate the point. The glibc implementation of malloc is more than 5k lines, and there has been a substantial amount of research into how to build good dynamic memory allocation mechanisms, so covering it all in a Stack Overflow answer is not possible. Hopefully this has given you a bit of a view of what is really causing the problem, though!
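
The simplified model in this answer can be sketched as a toy bump allocator. This is only an illustration of the idea, not how glibc is implemented: the names `toy_malloc`, `arena` and `demo_overflow_detected` are made up for the example, the blocks are only meant to be used as byte buffers, and where the real malloc aborts with an assertion this sketch just returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE 256

static unsigned char arena[ARENA_SIZE]; /* all "heap" memory lives here */
static size_t arena_used = 0;           /* offset of the first free byte */
static size_t last_hdr = 0;             /* offset of the newest size field */
static int have_last = 0;               /* has anything been allocated yet? */

/* Hands out `size` bytes, each block preceded by a size field.
   Returns NULL if the arena is full or the bookkeeping looks corrupted. */
static void *toy_malloc(size_t size)
{
    const size_t hdr = sizeof(size_t);

    /* Sanity check: the newest block's size field must still say that
       the block ends exactly where the free space begins. */
    if (have_last) {
        size_t recorded;
        memcpy(&recorded, arena + last_hdr, hdr);
        if (recorded > ARENA_SIZE || last_hdr + hdr + recorded != arena_used)
            return NULL; /* someone overwrote the size field */
    }

    if (size == 0 || size > ARENA_SIZE)
        return NULL;
    size = (size + hdr - 1) / hdr * hdr; /* round up to a multiple of hdr */
    if (arena_used + hdr + size > ARENA_SIZE)
        return NULL;

    memcpy(arena + arena_used, &size, hdr); /* write the size field */
    last_hdr = arena_used;
    have_last = 1;
    arena_used += hdr + size;
    return arena + last_hdr + hdr;          /* user data starts after it */
}

/* Allocates p1, p2, p3, overflows p2 into p3's size field, and returns 1
   if the next toy_malloc call notices the damage. */
static int demo_overflow_detected(void)
{
    unsigned char *p1 = toy_malloc(8);
    unsigned char *p2 = toy_malloc(8);
    unsigned char *p3 = toy_malloc(8);
    assert(p1 && p2 && p3);

    /* Bug: writes 8 bytes plus one size field's worth into an 8-byte
       block, so the tail lands on p3's size field (still inside arena). */
    memset(p2, 0xFF, 8 + sizeof(size_t));

    return toy_malloc(8) == NULL;
}
```

Note that the faulty write itself goes unnoticed; only the following allocation call trips over the damaged size field, which is why the failure shows up on a malloc line that looks perfectly innocent.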
My alternative solution to using Valgrind:
I'm very happy because I just helped my friend debug a program. His program had this exact problem (malloc() causing an abort), with the same error message from GDB.
I compiled his program with Address Sanitizer enabled and then ran gdb new. When the program gets terminated by the SIGABRT raised in a subsequent malloc(), a whole lot of useful information is printed, most notably a stack trace.
The first part of it says there's an invalid write operation at new.c:59. The second part says the memory that the bad write happened on was allocated at new.c:55. That's it. It took me less than half a minute to locate the bug that had confused my friend for a few hours. He had managed to locate the failure, but it was a subsequent malloc() call that failed, so he could not spot the error in the earlier code.
To sum up: try the -fsanitize=address option of GCC or Clang. It can be very helpful when debugging memory issues.
You are probably overrunning the allocated memory somewhere. The underlying software then doesn't pick up on it until you call malloc. There may be a guard value that was clobbered and is being caught by malloc.
Edit: added this link for bounds-checking help:
http://www.lrde.epita.fr/~akim/ccmp/doc/bounds-checking.html
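
To illustrate what a "guard value" check could look like, here is a minimal sketch in C. It is a hypothetical debugging aid written for this example (`guarded_alloc`, `guard_intact`, `GUARD_LEN` and `GUARD_BYTE` are invented names), not a description of what glibc's malloc does internally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8     /* number of guard bytes placed after the buffer */
#define GUARD_BYTE 0xA5  /* arbitrary pattern that stray writes will disturb */

/* Allocates n usable bytes followed by GUARD_LEN guard bytes.
   Returns NULL on failure; release the block with free(). */
static unsigned char *guarded_alloc(size_t n)
{
    unsigned char *p;

    if (n > SIZE_MAX - GUARD_LEN) /* avoid overflow in n + GUARD_LEN */
        return NULL;
    p = malloc(n + GUARD_LEN);
    if (p != NULL)
        memset(p + n, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Returns 1 if the guard bytes after the n usable bytes are untouched,
   0 if something has written past the end of the buffer. */
static int guard_intact(const unsigned char *p, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (p[n + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling `guard_intact(buf, n)` after each suspicious loop narrows down where an overrun happens, as long as the stray write stays within the guard region; a write that lands further away is not caught by this scheme.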
I got a message similar to yours.
I had made a mistake in an earlier method call when using malloc: while adding a field to an unsigned char array, I updated the factor after the sizeof() operator and erroneously overwrote the multiplication sign '*' with a '+'.
In another method later on, I used malloc again (in a simple enough call) and it produced the error message.
I think that using the '+' sign in the first call, which led to a miscalculated size, combined with initializing the array immediately afterwards (and so overwriting memory that was not allocated to the array), corrupted malloc's bookkeeping. That is why the second call went wrong.
We got this error because we forgot to multiply by sizeof(int). Note that the argument to malloc() is a number of bytes, not a number of elements or machine words.
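
Both size mistakes mentioned in these comments (a '+' where a '*' belongs, and a missing sizeof(int)) can be avoided by computing the byte count in one place. The following is a small sketch of that idea; the helper names `int_array_bytes`, `alloc_ints` and `fill_ints` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns the number of bytes needed for `count` ints,
   or 0 if the multiplication would overflow size_t. */
static size_t int_array_bytes(size_t count)
{
    if (count > SIZE_MAX / sizeof(int))
        return 0;
    return count * sizeof(int);
}

/* Allocates a zero-initialized array of `count` ints.
   calloc takes the element count and element size separately,
   so there is no hand-written multiplication to get wrong. */
static int *alloc_ints(size_t count)
{
    return calloc(count, sizeof(int));
}

/* Sets every one of the `count` elements to `value`. The loop bound is
   the element count, not the byte count. */
static void fill_ints(int *a, size_t count, int value)
{
    assert(a != NULL);
    for (size_t i = 0; i < count; i++)
        a[i] = value;
}
```

When calling malloc directly, writing the size as `n * sizeof *p` (with `p` being the pointer that receives the result) keeps the element type and the byte count in sync even if the type of `p` changes later.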
I was porting an application from Visual C to GCC on Linux and I had the same problem.
I moved the same code to a SUSE distribution (on another computer) and I don't have any problem there.
I suspect that the problem is not in our programs but in libc itself.
I got the same problem. I used malloc over and over again in a loop to add new char *string data. After releasing the allocated memory with free(), the problem was sorted.