Why does my C++ code cause a segmentation fault after calling read(...)?

Posted 2024-07-15 20:27:15

My application is suspending on a line of code that appears to have nothing wrong with it, however my IDE appears to be suspending on that line with the error:

gdb/mi (24/03/09 13:36) (Exited. Signal 'SIGSEGV' received. Description: Segmentation fault.)

The line of code simply calls a method which has no code in it. Isn't a segmentation fault what you get when you have a null reference? If so, how can an empty method have a null reference?

This piece of code seems to be causing the issue:

#include <sys/socket.h>

#define BUFFER_SIZE 256

char *buffer;

buffer = (char*)GetSomePointer()->SomeStackMemoryString.c_str();
int writeResult = write(socketFD, buffer, BUFFER_SIZE);

bzero(buffer, BUFFER_SIZE);
int readResult = read(socketFD, buffer, BUFFER_SIZE);

When the line using the read(...) method is commented out, the problem goes away.

Update:

I have changed the question to point toward the actual problem and removed all the irrelevant code. I have also answered my own question so that people reading this know specifically what the issue is. Please read my answer before saying "you're a moron!".

Comments (7)

千紇 2024-07-22 20:27:15

First, calling a method through a null pointer or reference is, strictly speaking, undefined behaviour. In practice it may appear to work unless the call is virtual.

Calling a virtual method virtually (through a pointer/reference, not from the derived class with the Class::Method() form of invocation) will typically crash if the reference/pointer is null, because a virtual call needs to read the vtable pointer stored in the object, and there is no object at a null address. So you can't call even an empty virtual method through a null reference/pointer.

To understand this you need to know more about how code is organized. For every non-inlined method there's a region of the code segment containing the machine code implementing the method.

When a call is made non-virtually (either from a derived class or to a non-virtual method through a reference/pointer) the compiler knows exactly which method to call (no polymorphism). So it just inserts a call to that exact portion of code and passes the this pointer as the first parameter. When calling through a null pointer, this will be null too, but that goes unnoticed if the method is empty and never touches this.

When a call is made virtually (through a reference/pointer) the compiler doesn't know exactly which method to call; it only knows that there's a table of virtual methods and that the address of the table is stored in the object. To find the method to call, it is necessary to first dereference the pointer/reference, get to the table, get the address of the method from it, and only then call the method. Reading the table is done at run time, not during compilation. If the pointer/reference is null you get a segmentation fault at this point.

This also explains why virtual calls generally can't be inlined: unless it can prove the object's dynamic type, the compiler has no idea what code to inline when it's looking at the source during compilation.
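To make the two kinds of call concrete, here is a minimal sketch. Shape, Circle and describe are hypothetical names that do not come from the question. The sketch deliberately never calls anything through a null pointer, since that is undefined behaviour; it only shows that a non-virtual call is bound from the static type while a virtual call is resolved from the object at run time.

```cpp
#include <string>

// Hypothetical types, used only to illustrate static vs. dynamic dispatch.
struct Shape {
    virtual ~Shape() = default;

    // Non-virtual: the call target is fixed at compile time from the static type.
    std::string staticName() const { return "Shape"; }

    // Virtual: the call target is looked up through the object's vtable at run time.
    virtual std::string dynamicName() const { return "Shape"; }
};

struct Circle : Shape {
    // Hides (does not override) the non-virtual function.
    std::string staticName() const { return "Circle"; }

    std::string dynamicName() const override { return "Circle"; }
};

// Both calls go through a Shape reference, so only the virtual one reaches Circle.
std::string describe(const Shape& s) {
    return s.staticName() + "/" + s.dynamicName();
}
```

Passing a Circle to describe yields "Shape/Circle": the first half was decided by the compiler, the second half needed the object itself, which is exactly the step that cannot work when there is no object.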

-黛色若梦 2024-07-22 20:27:15

Your code is bogus: buffer points to some random piece of memory. I'm not sure why the line with bzero is not failing.

The correct code is:

   char buffer[BUFFER_SIZE];

   bzero(buffer, BUFFER_SIZE);
   int readResult = read(socketFD, buffer, BUFFER_SIZE);

Alternatively, you can use calloc(1, BUFFER_SIZE) to allocate zeroed memory, as long as you free() it afterwards.
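In C++ the heap-allocated variant can also be written with std::vector, which zero-fills its elements and releases the memory automatically. The following is only a sketch under that assumption; readChunk is a hypothetical helper name, and fd stands for any readable POSIX descriptor such as the question's socketFD.

```cpp
#include <unistd.h>   // read (POSIX)

#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t BUFFER_SIZE = 256;

// Reads at most BUFFER_SIZE bytes from fd into a buffer this function owns.
// Returns the bytes received, or an empty string on error or end of file.
std::string readChunk(int fd) {
    // std::vector value-initialises its elements, so the buffer starts zeroed,
    // much like calloc(1, BUFFER_SIZE), and is freed automatically.
    std::vector<char> buffer(BUFFER_SIZE);

    ssize_t n = read(fd, buffer.data(), buffer.size());
    if (n <= 0) {
        return std::string();
    }
    return std::string(buffer.data(), static_cast<std::size_t>(n));
}
```

Because the returned string is built from the byte count that read reported, the caller does not depend on the buffer being null-terminated.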

千仐 2024-07-22 20:27:15

Without code, the best I can do is a wild guess. But here goes:

Your "long-running code" is writing through an invalid pointer (either a totally random pointer, or one that runs past the start or end of a buffer or array). This happens to be clobbering the virtual function table for your object: it is overwriting either the pointer to the object, the vptr member of the object, or the actual global virtual function table for that class.

Some things to try:

  • Put a sentinel member in your class. E.g. an int which is initialised to a known pattern (0xdeadbeef or 0xcafebabe are common) in your constructor, and never changed. Before you make the virtual function call, check (assert()) that it still has the right value.
  • Try using a memory debugger. On Linux, options include Electric Fence (efence) or Valgrind.
  • Run your program under a debugger (gdb is fine) and poke around to see what's wrong - either post-mortem after the segfault happens, or by setting a breakpoint just before the place it's going to segfault.
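The sentinel suggestion could look roughly like the sketch below. Connection, poll, checkedPoll and simulateCorruption are hypothetical names invented for illustration; the real class in the question is not shown, so this only demonstrates the pattern.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class showing the sentinel idea; the names are illustrative.
class Connection {
public:
    static constexpr std::uint32_t kSentinel = 0xdeadbeefu;

    virtual ~Connection() = default;

    // Stands in for the virtual call that was crashing.
    virtual void poll() {}

    // Returns true while the sentinel still holds the pattern set at construction.
    bool intact() const { return sentinel_ == kSentinel; }

    // Checks the sentinel before making the virtual call.
    void checkedPoll() {
        assert(intact() && "Connection was overwritten before poll()");
        poll();
    }

    // Test hook only: mimics a stray write landing on this object.
    void simulateCorruption() { sentinel_ = 0; }

private:
    std::uint32_t sentinel_ = kSentinel;
};
```

A sentinel only catches writes that happen to land on the object itself, so it complements a memory debugger such as Valgrind and does not replace one.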

你爱我像她 2024-07-22 20:27:15

I cannot think of any reason why an empty method on its own would cause such a problem. Without any other context, my first thought would be that a problem elsewhere is corrupting your memory and it just so happens to manifest itself in this way here.

We had that kind of a problem before, and I wrote about it in this answer here. That same question has a lot of other good advice in it too which might help you.

度的依靠╰つ 2024-07-22 20:27:15

Isn't a segmentation fault when you have a null reference?

Possibly, but not necessarily. What causes a segfault is somewhat platform-specific, but it basically means that your program is accessing memory that it shouldn't be. You might want to read the Wikipedia article on segmentation faults to get a better idea of what it is.

One thing you might check: does the empty method have a return type? I could be wrong on this, but if it is declared to return an object and the body never actually returns one, I could see a copy constructor getting called on garbage. (Flowing off the end of a non-void function is undefined behaviour in C++.) This could cause all sorts of wonky behavior.

Do you get the same result if you change its return type to void, or if you return a value?

提笔书几行 2024-07-22 20:27:15

The problem is that buffer does not point to memory the code is allowed to write BUFFER_SIZE bytes into. It points at the internal storage of a string, so memory is corrupted when the read(...) function puts data in buffer.

With an uninitialised pointer, bzero would normally have caused the segmentation fault itself. Here buffer points at the string's own storage, which is only as large as the string, so bzero and read(...) both write past the end of it (a buffer overrun, not a leak); the crash just happened to surface at read(...).

/* this makes buffer point at the string's own storage,
 * which is why bzero(...) did not SIGSEGV right away */
buffer = (char*)GetSomePointer()->SomeStackMemoryString.c_str();

int writeResult = write(socketFD, buffer, BUFFER_SIZE);

This change fixes the overrun:

#include <string>
#include <strings.h>  // bzero
#include <unistd.h>   // read, write

#define BUFFER_SIZE 256

// Use memory on the stack, for auto allocation and release.
char buffer[BUFFER_SIZE];

// Don't write to the buffer, just pass in the chars on their own.
std::string writeString = GetSomePointer()->SomeStackMemoryString;
ssize_t writeResult = write(socketFD, writeString.c_str(), writeString.length());

// It's now safe to use the buffer, as stack memory is used.
bzero(buffer, BUFFER_SIZE);
ssize_t readResult = read(socketFD, buffer, BUFFER_SIZE);
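One detail the snippet leaves open is that write may accept fewer bytes than requested. A possible way to handle that is sketched below; writeAll is a hypothetical helper, not part of the original answer, and it assumes a blocking POSIX descriptor.

```cpp
#include <unistd.h>   // write (POSIX)

#include <cerrno>
#include <cstddef>
#include <string>

// Writes every byte of text to fd, retrying after partial writes.
// Returns true on success, false if write() reports an error.
bool writeAll(int fd, const std::string& text) {
    const char* data = text.data();
    std::size_t remaining = text.size();   // the string's real length, not BUFFER_SIZE

    while (remaining > 0) {
        ssize_t n = write(fd, data, remaining);
        if (n < 0) {
            if (errno == EINTR) {
                continue;   // interrupted by a signal: try again
            }
            return false;
        }
        data += n;
        remaining -= static_cast<std::size_t>(n);
    }
    return true;
}
```

Taking the length from the string itself is the key point: the original code passed BUFFER_SIZE, which made write read up to 256 bytes from storage that may have been much shorter.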

对岸观火 2024-07-22 20:27:15

Are you calling the virtual method from the constructor of a base class? That could be the problem: If you're calling a pure virtual method from class Base in Base's constructor, and it is only actually defined in class Derived, you might end up accessing a vtable record that has not yet been set, because Derived's constructor has not been executed at that point.
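A sketch of the well-defined variant of this situation follows. Here the virtual function is not pure, so calling it from the constructor is legal; it simply does not reach the derived class. The member seenInConstructor is a hypothetical name added for illustration.

```cpp
#include <string>

// Hypothetical classes; the virtual function here is not pure, so the call
// from the constructor is well defined and simply does not reach Derived.
struct Base {
    std::string seenInConstructor;

    Base() {
        // While Base's constructor runs, the object's dynamic type is Base,
        // so this resolves to Base::name() even when building a Derived.
        seenInConstructor = name();
    }
    virtual ~Base() = default;

    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};
```

If name() were pure virtual in Base, the same call from the constructor would be undefined behaviour; a common symptom is the program aborting with a message about a pure virtual call.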
