Program using read() goes into an infinite loop
void ReadBinary(char *infile,HXmap* AssetMap)
{
    int fd;
    size_t bytes_read, bytes_expected = 100000000*sizeof(char);
    char *data;
    if ((fd = open(infile,O_RDONLY)) < 0)
        err(EX_NOINPUT, "%s", infile);
    if ((data = malloc(bytes_expected)) == NULL)
        err(EX_OSERR, "data malloc");
    bytes_read = read(fd, data, bytes_expected);
    if (bytes_read != bytes_expected)
        printf("Read only %d of %d bytes %d\n", \
               bytes_read, bytes_expected,EX_DATAERR);
    /* ... operate on data ... */
    printf("\n");
    int i=0;
    int counter=0;
    char ch=data[0];
    char message[512];
    Message* newMessage;
    while(i!=bytes_read)
    {
        while(ch!='\n')
        {
            message[counter]=ch;
            i++;
            counter++;
            ch =data[i];
        }
        message[counter]='\n';
        message[counter+1]='\0';
        //---------------------------------------------------
        newMessage = (Message*)parser(message);
        MessageProcess(newMessage,AssetMap);
        //--------------------------------------------------
        //printf("idNUM %e\n",newMessage->idNum);
        free(newMessage);
        i++;
        counter=0;
        ch =data[i];
    }
    free(data);
}
Here I allocate 100 MB with malloc and pass in a file of about 926 KB (large, but not 500 MB). With small files the program reads and exits like a charm, but with a big enough file it runs up to some point and then just hangs. I suspect it has either entered an infinite loop or there is a memory leak.
EDIT: For better understanding, I stripped away all unnecessary function calls and checked what happens when a large file is given as input. I have attached the modified code:
void ReadBinary(char *infile,HXmap* AssetMap)
{
    int fd;
    size_t bytes_read, bytes_expected = 500000000*sizeof(char);
    char *data;
    if ((fd = open(infile,O_RDONLY)) < 0)
        err(EX_NOINPUT, "%s", infile);
    if ((data = malloc(bytes_expected)) == NULL)
        err(EX_OSERR, "data malloc");
    bytes_read = read(fd, data, bytes_expected);
    if (bytes_read != bytes_expected)
        printf("Read only %d of %d bytes %d\n", \
               bytes_read, bytes_expected,EX_DATAERR);
    /* ... operate on data ... */
    printf("\n");
    int i=0;
    int counter=0;
    char ch=data[0];
    char message[512];
    while(i<=bytes_read)
    {
        while(ch!='\n')
        {
            message[counter]=ch;
            i++;
            counter++;
            ch =data[i];
        }
        message[counter]='\n';
        message[counter+1]='\0';
        i++;
        printf("idNUM \n");
        counter=0;
        ch =data[i];
    }
    free(data);
}
What happens is that it prints a whole lot of idNUM lines and then, poof, segmentation fault.
I think this is interesting behaviour, and to me it looks like there is some problem with memory.
FURTHER EDIT: I changed the condition back to i!=bytes_read and it gives no segmentation fault. When I check for i<=bytes_read, it blows past the limits in the inner loop (courtesy of gdb).
Comments (3)
The most glaring problem is this:
Unless the last character of the file (or the block that you've just read) is \n, you'll go past the end of the data array, most probably smashing the stack along the way (since you're not checking whether your write to message is within bounds).
Try the following loop. Basically, it refactors your implementation so there is only one place where i is incremented. Having two places is what's causing your trouble.
Or, you could take the "don't reinvent the wheel" approach and use a standard strtok call:
Is it possible that on the system you're using, 500,000,000 is larger than the largest size_t? If so, bytes_expected may be rolling over to some smaller value. Then bytes_read is following suit, and you're ending up taking a smaller chunk of data than you actually expect. The result would be that for large data, the last character of data is unlikely to be a '\n', so you blow right past it in that inner loop and start accessing characters beyond the end of data. Segfault follows.
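One way to test this hypothesis is to compare the requested size against SIZE_MAX directly. On common 32-bit and 64-bit platforms SIZE_MAX is at least 4,294,967,295, so 500,000,000 fits and the rollover would not occur there. A related hazard, not raised in the thread, is that read() returns ssize_t, so an error return of -1 stored in a size_t turns into a very large count. The sketch below is a minimal illustration of both checks; fits_in_size_t and usable_length are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Returns 1 if the requested byte count can be represented as size_t. */
int fits_in_size_t(unsigned long long requested)
{
    return requested <= SIZE_MAX;
}

/*
 * Turn read()'s ssize_t result into a safe length:
 * an error return (-1) becomes 0 instead of wrapping to a huge size_t.
 */
size_t usable_length(ssize_t n)
{
    return n < 0 ? 0 : (size_t)n;
}

/* Print the platform limit; %zu is the conversion for size_t. */
void print_size_limit(void)
{
    printf("SIZE_MAX = %zu\n", (size_t)SIZE_MAX);
}
```

The same %zu conversion would also be the appropriate one for printing bytes_read and bytes_expected in the question's printf, which currently uses %d.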