Program using read() enters an infinite loop

Posted on 2024-10-18 14:07:20

void ReadBinary(char *infile,HXmap* AssetMap)
{
    int fd; 
   size_t bytes_read, bytes_expected = 100000000*sizeof(char); 
   char *data;

   if ((fd = open(infile,O_RDONLY)) < 0) 
      err(EX_NOINPUT, "%s", infile);


   if ((data = malloc(bytes_expected)) == NULL)
      err(EX_OSERR, "data malloc");

   bytes_read = read(fd, data, bytes_expected);

   if (bytes_read != bytes_expected) 
      printf("Read only %d of %d bytes %d\n", \
         bytes_read, bytes_expected,EX_DATAERR);

   /* ... operate on data ... */
    printf("\n");
    int i=0;
    int counter=0;
    char ch=data[0];
    char message[512];
    Message* newMessage;
    while(i!=bytes_read)
    {

        while(ch!='\n')
        {
        message[counter]=ch;
        i++;
        counter++;
        ch =data[i];
        }
    message[counter]='\n';
    message[counter+1]='\0';
//---------------------------------------------------
    newMessage = (Message*)parser(message);
    MessageProcess(newMessage,AssetMap);
//--------------------------------------------------    
    //printf("idNUM %e\n",newMessage->idNum);
    free(newMessage);
    i++;
    counter=0;
    ch =data[i];
    }
   free(data);  

}

Here I have allocated 100 MB with malloc and passed in a file of about 926 KB, which is far smaller than the buffer (not 500 MB). When I pass small files, the program reads them and exits without trouble, but when I pass a big enough file, it runs up to some point and then just hangs. I suspect it has either entered an infinite loop or there is a memory leak.

EDIT: For better understanding, I stripped away all unnecessary function calls and checked what happens when a large file is given as input. The modified code follows.

void ReadBinary(char *infile,HXmap* AssetMap)
{
    int fd; 
   size_t bytes_read, bytes_expected = 500000000*sizeof(char); 
   char *data;

   if ((fd = open(infile,O_RDONLY)) < 0) 
      err(EX_NOINPUT, "%s", infile);


   if ((data = malloc(bytes_expected)) == NULL)
      err(EX_OSERR, "data malloc");

   bytes_read = read(fd, data, bytes_expected);

   if (bytes_read != bytes_expected) 
      printf("Read only %d of %d bytes %d\n", \
         bytes_read, bytes_expected,EX_DATAERR);

   /* ... operate on data ... */
    printf("\n");
    int i=0;
    int counter=0;
    char ch=data[0];
    char message[512];
    while(i<=bytes_read)
    {

        while(ch!='\n')
        {
        message[counter]=ch;
        i++;
        counter++;
        ch =data[i];
        }
    message[counter]='\n';
    message[counter+1]='\0';
    i++;
    printf("idNUM \n");
    counter=0;
    ch =data[i];
    }
   free(data);  

}

What appears to happen is that it prints a whole lot of idNUM lines and then, poof, a segmentation fault.

I think this is interesting behaviour, and to me it looks like there is some problem with memory.

FURTHER EDIT: I changed the condition back to i!=bytes_read and it no longer gives a segmentation fault. When I check for i<=bytes_read, it blows past the limits in the inner loop (courtesy of gdb).

Comments (3)

原来分手还会想你 2024-10-25 14:07:20

The most glaring problem is this:

    while(ch!='\n')
    {
    message[counter]=ch;
    i++;
    counter++;
    ch =data[i];
    }

Unless the last character of the file (or the block that you've just read) is \n, you'll go past the end of the data array, most probably smashing the stack along the way (since you're not checking whether your write to message is within bounds).
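
As a minimal sketch of the bounds checks described above (not the original poster's code), the inner loop can be rewritten so that it stops at the end of the data and never writes past the end of the message buffer. The helper name next_line and its parameters are hypothetical, and unlike the question's loop it drops the trailing '\n' and silently truncates overlong lines:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the line starting at data[pos] into out
 * (capacity cap, which must be at least 1), without the newline, and
 * always NUL-terminates it. Returns the index where the next line
 * starts, or len once the input is exhausted. */
size_t next_line(const char *data, size_t len, size_t pos,
                 char *out, size_t cap)
{
    size_t n = 0;

    /* Stop on a newline OR at the end of the data, whichever comes first. */
    while (pos < len && data[pos] != '\n') {
        if (n + 1 < cap)          /* leave room for the terminating NUL */
            out[n++] = data[pos]; /* bytes beyond cap - 1 are dropped */
        pos++;
    }
    out[n] = '\0';

    if (pos < len)                /* skip the newline we stopped on */
        pos++;
    return pos;
}
```

A caller would loop with `for (size_t pos = 0; pos < bytes_read; )` and assign `pos = next_line(data, bytes_read, pos, message, sizeof message);` on each pass, so a file that does not end in '\n' simply yields a final, unterminated line instead of an overrun.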

夜血缘 2024-10-25 14:07:20

Try the following loop. Basically, it refactors your implementation so there is only one place where i is incremented. Having two places is what's causing your trouble.

#include <stdio.h>
#include <string.h>

int main()
{
    const char* data = "First line\nSecond line\nThird line";
    unsigned int bytes_read = strlen(data);

    unsigned int i = 0;
    unsigned int counter = 0;
    char message[512];

    while (i < bytes_read)
    {
        message[counter] = data[i];
        ++counter;
        if (data[i] == '\n')
        {
            message[counter] = '\0';
            printf("%s", message);
            counter = 0;
        }
        ++i;
    }

    // If data didn't end with a newline
    if (counter)
    {
        message[counter] = '\0';
        printf("%s\n", message);
    }

    return 0;
}

Or, you could take the "don't reinvent the wheel" approach and use a standard strtok call:

#include <stdio.h>
#include <string.h>

int main()
{
    char data[] = "First line\nSecond line\nThird line";
    char* message = strtok(data, "\n");

    while (message)
    {
        printf("%s\n", message);
        message = strtok(NULL, "\n");
    }

    return 0;
}
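
One caveat when applying strtok to the question's buffer: strtok expects a modifiable, NUL-terminated string, and the bytes returned by read() are not NUL-terminated. A possible alternative, sketched here under that assumption, walks the buffer by length with memchr. The name for_each_line and its callback signature are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: invokes fn(line, line_len, ctx) for every line in
 * data[0..len). Lines are NOT NUL-terminated; the callback receives a
 * pointer and a length. A final line without '\n' is still reported.
 * Returns the number of lines seen. fn may be NULL to just count. */
size_t for_each_line(const char *data, size_t len,
                     void (*fn)(const char *line, size_t n, void *ctx),
                     void *ctx)
{
    const char *p = data;
    const char *end = data + len;
    size_t count = 0;

    while (p < end) {
        /* Search only within the bytes that were actually read. */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        const char *stop = nl ? nl : end;

        if (fn)
            fn(p, (size_t)(stop - p), ctx);
        count++;

        p = nl ? nl + 1 : end;
    }
    return count;
}
```

Unlike strtok, this sketch reports empty lines rather than skipping them, and it leaves the input buffer unmodified.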

撩人痒 2024-10-25 14:07:20

Is it possible that on the system you're using, 500,000,000 is larger than the largest size_t? If so, bytes_expected may be rolling over to some smaller value. Then bytes_read is following suit, and you're ending up taking a smaller chunk of data than you actually expect. The result would be that for large data, the last character of data is unlikely to be a '\n', so you blow right past it in that inner loop and start accessing characters beyond the end of data. Segfault follows.
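
The rollover hypothesis above can be checked directly by comparing the requested size against SIZE_MAX from <stdint.h>. This is only a sketch, and the helper name fits_in_size_t is invented for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if n can be stored in a size_t on this
 * platform without wrapping, 0 otherwise. */
int fits_in_size_t(uintmax_t n)
{
    return n <= (uintmax_t)SIZE_MAX;
}
```

Calling `fits_in_size_t(500000000)` before the malloc would confirm or rule out this explanation. On platforms where size_t is 32 bits or wider, the value fits, so the rollover would only occur with a narrower size_t.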
