Why does my C++ code cause a segmentation fault after using the read(...) function?
My application is suspending on a line of code that appears to have nothing wrong with it; my IDE stops on that line with the error:
gdb/mi (24/03/09 13:36) (Exited. Signal 'SIGSEGV' received. Description: Segmentation fault.)
The line of code simply calls a method which has no code in it. Isn't a segmentation fault what you get when you have a null reference? If so, how can an empty method have a null reference?
This piece of code seems to be causing the issue:
#include <sys/socket.h>
#define BUFFER_SIZE 256
char *buffer;
buffer = (char*)GetSomePointer()->SomeStackMemoryString.c_str();
int writeResult = write(socketFD, buffer, BUFFER_SIZE);
bzero(buffer, BUFFER_SIZE);
int readResult = read(socketFD, buffer, BUFFER_SIZE);
When the line using the read(...) method is commented out, the problem goes away.
Update:
I have changed the question to point toward the actual problem and removed all the irrelevant code. I have also answered my own question so that people reading this know specifically what the issue is. Please read my answer before saying "you're a moron!".
Comments (7)
First, calling a method through a null pointer or reference is strictly speaking undefined behaviour. But it may succeed unless the call is virtual.
Calling a virtual method virtually (through a pointer/reference, as opposed to a qualified Class::Method() invocation from a derived class) will in practice fail if the reference/pointer is null, because a virtual call needs to read the vtable, and the vtable cannot be reached through a null pointer/reference. So you can't call even an empty virtual method through a null reference/pointer.
To understand this you need to know more about how code is organized. For every non-inlined method there's a section of code segment containing the machine code implementing the method.
When a call is done non-virtually (either from a derived class or a non-virtual method through a reference/pointer) the compiler knows exactly which method to call (no polymorphism). So it just inserts a call to an exact portion of code and passes this pointer as the first parameter there. In case of calling through null pointer this will be null too, but you don't care if your method is empty.
When a call is done virtually (through a reference/pointer) the compiler doesn't know exactly which method to call; it only knows that there's a table of virtual methods and that the address of the table is stored in the object. To find which method to call, it's necessary to first dereference the pointer/reference, get to the table, get the address of the method from it, and only then call the method. Reading the table is done at runtime, not during compilation. If the pointer/reference is null you get a segmentation fault at this point.
This also explains why virtual calls generally can't be inlined: unless the compiler can prove the object's dynamic type, it has no idea what code to inline when it's looking at the source during compilation.
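To make the difference concrete, here is a minimal sketch that models a vtable by hand. It is not what any particular compiler emits. The names Shape, VTable, static_call and virtual_call are made up for illustration, and the null guard in virtual_call exists only so the sketch stays well defined (a real virtual call has no such check and would fault on the load).

```cpp
#include <cassert>

struct Shape;  // forward declaration so the table can mention it

// One slot per "virtual" method (hypothetical layout, for illustration only).
struct VTable {
    int (*sides)(const Shape*);
};

// Every object carries a pointer to its class's table.
struct Shape {
    const VTable* vptr;
};

int triangle_sides(const Shape*) { return 3; }
int square_sides(const Shape*) { return 4; }

const VTable triangle_vtable{&triangle_sides};
const VTable square_vtable{&square_sides};
const Shape triangle{&triangle_vtable};
const Shape square{&square_vtable};

// Non-virtual call: the target is fixed at compile time and `self` is only
// passed along, never read, so a null `self` goes unnoticed here.
int static_call(const Shape* self) { return triangle_sides(self); }

// Virtual call: the target must be loaded from the object first.
int virtual_call(const Shape* self) {
    if (self == nullptr) return -1;  // guard added for the sketch only
    return self->vptr->sides(self);  // this load is where a null `self` would fault
}

// Usage: all four checks hold.
void demo_dispatch() {
    assert(virtual_call(&triangle) == 3);
    assert(virtual_call(&square) == 4);
    assert(static_call(nullptr) == 3);
    assert(virtual_call(nullptr) == -1);
}
```

Note that static_call is an ordinary function taking a pointer, so passing null to it is legal. Calling a real member function through a null pointer remains undefined behaviour even when it appears to work.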
Your code is bogus: buffer points to some random piece of memory. I'm not sure why the line with bzero is not failing.
The correct code is:
or you can use calloc(1, BUFFER_SIZE) to get some memory allocated (and zeroed out).
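The snippet this answer pointed to is not preserved on this page, so the following is only a sketch of the calloc variant it mentions, not the answerer's own code. FreeDeleter, OwnedBuffer and make_buffer are hypothetical names. The idea is that the buffer is memory the caller owns, BUFFER_SIZE bytes long and zeroed, instead of a pointer into a std::string's internal storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

constexpr std::size_t BUFFER_SIZE = 256;

// Releases calloc'd memory with free (hypothetical helper).
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};
using OwnedBuffer = std::unique_ptr<char, FreeDeleter>;

// Returns BUFFER_SIZE zeroed bytes that hold a (possibly truncated) copy of
// `text`; returns an empty pointer if the allocation fails.
OwnedBuffer make_buffer(const std::string& text) {
    OwnedBuffer buffer(static_cast<char*>(std::calloc(1, BUFFER_SIZE)));
    if (!buffer) return buffer;
    const std::size_t n = std::min(text.size(), BUFFER_SIZE - 1);
    std::memcpy(buffer.get(), text.data(), n);  // byte n and beyond stay zero
    return buffer;
}

// Usage: short text is copied whole, long text is cut to fit.
void demo_buffer() {
    OwnedBuffer small = make_buffer("hi");
    assert(small && std::strcmp(small.get(), "hi") == 0);
    OwnedBuffer big = make_buffer(std::string(1000, 'a'));
    assert(big && std::strlen(big.get()) == BUFFER_SIZE - 1);
}
```

Because calloc already zeroes the block, no separate bzero call is needed before handing buffer.get() to read(...).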
Without code, the best I can do is a wild guess. But here goes:
Your "long-running code" is writing to an invalid pointer (either a totally random pointer, or one going past the start or end of a buffer or array). This happens to be clobbering the virtual function table for your object: it's overwriting the pointer to the object, or the vptr member of the object, or the actual global virtual function table for that class.
Some things to try:
I cannot think of any reason why an empty method on its own would cause such a problem. Without any other context, my first thought would be that a problem elsewhere is corrupting your memory and it just so happens to manifest itself in this way here.
We had that kind of a problem before, and I wrote about it in this answer here. That same question has a lot of other good advice in it too which might help you.
Possibly, but not necessarily. What causes a segfault is somewhat platform-specific, but it basically means that your program is accessing memory that it shouldn't be. You might want to read the Wikipedia article to get a better idea of what it is.
One thing you might check: does the empty method have a return type? I could be wrong on this, but if it is declared to return an object and never actually returns one, I could see how a copy constructor could get called on garbage. This could cause all sorts of wonky behavior.
Do you get the same result if you change its return type to void or you return a value?
The issue is that the buffer variable points to memory that was never allocated for it, which causes memory corruption when the read(...) function puts data in buffer. Normally, bzero would actually cause the segmentation fault, but because buffer pointed at a string's storage, the read function was allowed to write past the allocated memory (strictly speaking an overrun that corrupts memory, rather than a leak).
This change solves the memory problem:
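The code that followed is missing from this page, so what comes next is a hedged reconstruction of the kind of change described, not the original fix. A pipe stands in for the socket so the sketch can run on its own (POSIX only), and round_trip is a hypothetical name. The two points it illustrates are that write sends only the bytes the string really holds and that read fills a separate array of BUFFER_SIZE bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // pipe, read, write, close (POSIX)

constexpr std::size_t BUFFER_SIZE = 256;

// Sends `message` through a pipe and reads it back into a buffer we own.
// Meant for short messages only: a message larger than the pipe's capacity
// would block in write because nothing is reading yet.
std::string round_trip(const std::string& message) {
    int fds[2];
    if (pipe(fds) != 0) return {};

    // Send only the bytes the string really holds, not BUFFER_SIZE of them.
    const ssize_t sent = write(fds[1], message.data(), message.size());
    close(fds[1]);

    char buffer[BUFFER_SIZE] = {};  // storage we own, already zeroed
    const ssize_t got = (sent >= 0) ? read(fds[0], buffer, sizeof buffer) : -1;
    close(fds[0]);

    if (got <= 0) return {};
    return std::string(buffer, static_cast<std::size_t>(got));
}

// Usage: a short message survives intact, a longer one is capped by the buffer.
void demo_round_trip() {
    assert(round_trip("hello") == "hello");
    assert(round_trip(std::string(300, 'x')).size() == BUFFER_SIZE);
}
```

Using the return value of read to size the result avoids relying on a terminating zero byte being present in the buffer.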
Are you calling the virtual method from the constructor of a base class? That could be the problem: if you're calling a pure virtual method from class Base in Base's constructor, and it is only actually defined in class Derived, you might end up accessing a vtable record that has not yet been set, because Derived's constructor has not been executed at that point.
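A small sketch of the effect, using my own Base and Derived with a non-pure virtual so the behaviour stays well defined (calling a pure virtual from a constructor is undefined and typically aborts). The helper function names are made up for illustration.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;
    // While Base's constructor runs, the object is still a Base, so this
    // call resolves to Base::name even when a Derived is being built.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// What the virtual call inside Base's constructor resolved to.
std::string name_seen_during_construction() {
    Derived d;
    return d.seen_in_ctor;
}

// What the same virtual call resolves to once construction has finished.
std::string name_seen_after_construction() {
    Derived d;
    const Base& b = d;
    return b.name();
}

// Usage: the override is only reachable after Derived is fully constructed.
void demo_ctor_dispatch() {
    assert(name_seen_during_construction() == "Base");
    assert(name_seen_after_construction() == "Derived");
}
```

If Base::name were pure virtual instead, there would be no Base version to fall back on during construction, which is the situation described above.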