C: strtok returns the first value and then NULL
I am simply trying to return every word in a string, but strtok returns the first word and then immediately returns NULL:
int main(int argc, char *argv[]) {
// Get the interesting file contents
char *filestr = get_file(argv[1]);
printf("%s\n", filestr);
char *word;
word = strtok(filestr, ";\"'-?:{[}](), \n");
while (word != NULL) {
word = strtok(NULL, ";\"'-?:{[}](), \n");
printf("This was called. %s\n", word);
}
exit(0);
}
get_file simply opens the specified path and returns the file's contents as a string. The printf("%s\n", filestr); call shown above successfully prints out the entirety of any given file. Hence, I do not think get_file() is the problem.
If I call strtok on char test[] = "this is a test string" instead of filestr, it correctly returns each of the words. If, however, the file read by get_file() contains "this is a string", then it returns "this" and then (null).
By request, here is the code for get_file():
// Take the path to the file as a string and return a string with all that
// file's contents
char *get_file (char *dest) {
// Define variables that will be used
size_t length;
FILE* file;
char* data;
file = fopen(dest, "rb");
// Go to end of stream
fseek(file, 0, SEEK_END);
// Set the int length to the end seek value of the stream
length = ftell(file);
// Go back to the beginning of the stream for when we actually read contents
rewind(file);
// Define the size of the char array str
data = (char*) malloc(sizeof(char) * length + 1);
// Read the stream into the string str
fread(data, 1, length, file);
// Close the stream
fclose(file);
return data;
}
Are you passing a binary file with null characters in it?
get_file() is correctly returning a character buffer, but if (for example) I give your function a .png file, the buffer looks like this:
(gdb) p data[0] @32
$5 = "\211PNG\r\n\032\n\000\000\000\rIHDR\000\000\003\346\000\000\002\230\b\006\000\000\000\376?"
You can see that after the PNG\r\n there are null characters, so you can't really treat the return value of get_file() as a string. You'd need to treat it as a character array and return the total length explicitly rather than rely on null termination.
Then, as it's currently written, you can't rely on strtok, since it stops processing when it hits the first null character. You could work around this by doing a pass over your data and converting all the null characters into something else, or you could implement a version of strtok that works on buffers of a given length.
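As a rough sketch of "return the total length manually", here is one way get_file() could be reworked. The name get_file_len and the out_len parameter are made up for this example and are not from the question. It also writes a terminating '\0' after the data, which the original get_file() allocates room for but never stores.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of get_file(): reads the whole file at `path`,
 * stores the number of bytes read in *out_len, and appends a '\0'.
 * Returns NULL on failure. The caller must free() the result. */
char *get_file_len(const char *path, size_t *out_len) {
    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return NULL;

    /* Find the file size by seeking to the end of the stream. */
    if (fseek(file, 0, SEEK_END) != 0) {
        fclose(file);
        return NULL;
    }
    long size = ftell(file);
    if (size < 0) {
        fclose(file);
        return NULL;
    }
    rewind(file);

    /* One extra byte for the terminator. */
    char *data = malloc((size_t)size + 1);
    if (data == NULL) {
        fclose(file);
        return NULL;
    }

    size_t got = fread(data, 1, (size_t)size, file);
    fclose(file);

    /* Terminate the buffer and report how many bytes it really holds. */
    data[got] = '\0';
    *out_len = got;
    return data;
}
```

With this, comparing strlen(data) against the returned length tells you whether the file contains embedded null characters: if strlen is shorter, strtok will stop early too.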
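For the second option, a minimal sketch of a length-aware tokenizer might look like the following. The function next_token and its parameters are invented for illustration. Unlike strtok, it does not modify the buffer and keeps its position in a caller-supplied variable, and it treats an embedded '\0' as just another separator (an assumption you may want to change).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-aware tokenizer. Scans buf[*pos .. len) for the next
 * run of bytes that are neither in `delims` nor '\0'. On success it returns
 * a pointer to the start of the token, stores the token length in *tok_len,
 * and advances *pos past the token. Returns NULL when no tokens remain. */
const char *next_token(const char *buf, size_t len, size_t *pos,
                       const char *delims, size_t *tok_len) {
    size_t i = *pos;

    /* Skip leading separators; a '\0' byte counts as a separator here. */
    while (i < len && (buf[i] == '\0' || strchr(delims, buf[i]) != NULL))
        i++;
    if (i >= len) {
        *pos = len;
        return NULL;
    }

    /* Consume bytes until the next separator or the end of the buffer. */
    size_t start = i;
    while (i < len && buf[i] != '\0' && strchr(delims, buf[i]) == NULL)
        i++;

    *tok_len = i - start;
    *pos = i;
    return buf + start;
}
```

Because the tokens are not null-terminated, print them with an explicit length, for example printf("%.*s\n", (int)tok_len, tok), and start with pos set to 0.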