Problem reading strings with fscanf

Posted on 2024-08-14 17:15:37

I'm reading in a .txt file. I'm using fscanf to get the data as it is formatted.
The line I'm having problems with is this:

result = fscanf(fp, "%s", ap->name);

This is fine until I have a name containing whitespace, e.g. St Ives.
So I use this to read in the whitespace:

result = fscanf(fp, "%[^\n]s", ap->name);

However, when I try to read in the first name (with no white space) it just doesn't work and messes up the other fscanf.

But when I use [^\n] it works fine in a different file I'm using. Not sure what is happening.

If I use fgets in the place of the fscanf above I get "\n" in the variable.

Edit//

Ok, so if I use:

result = fscanf(fp, "%s", ap->name);
result = fscanf(fp, "%[^\n]s", ap->name);

This allows me to read in a string with no whitespace. But when I get a "name" with whitespace it doesn't work.

Comments (5)

最舍不得你 2024-08-21 17:15:37

One problem with this:

result = fscanf(fp, "%[^\n]s", ap->name);

is that you have an extra s at the end of your format specifier. The entire format specifier should just be %[^\n], which says "read in a string which consists of characters which are not newlines". The extra s is not part of the format specifier, so it's interpreted as a literal: "read the next character from the input; if it's an "s", continue, otherwise fail."

The extra s doesn't actually hurt you, though. You know exactly what the next character of input is: a newline. It doesn't match, and input processing stops there, but it doesn't really matter since it's the end of your format string. This would cause problems, though, if you had other format specifiers after this one in the same format string.

The real problem is that you're not consuming the newline: you're only reading in all of the characters up to the newline, but not the newline itself. To fix that, you should do this:

result = fscanf(fp, "%[^\n]%*c", ap->name);

The %*c specifier says to read in a character (c), but don't assign it to any variable (*). If you omitted the *, you would have to pass fscanf() another parameter containing a pointer to a character (a char*), where it would then store the resulting character that it read in.

You could also use %[^\n]\n, but that would also read in any whitespace which followed the newline, which may not be what you want. When fscanf finds whitespace in its format string (a space, newline, or tab), it consumes as much whitespace as it can (i.e. you can think of it as consuming the longest string that matches the regular expression [ \t\n]*).
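One caveat with "%[^\n]%*c": if the very next character is already a newline (an empty line, or a newline left over from an earlier %s), the scanset matches nothing, fscanf stops with a matching failure, and the %*c is never reached. A minimal sketch that handles this case, assuming a 256-byte name buffer; read_name_line is a hypothetical helper name, not something from the question:

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post).
 * Reads the rest of the current line into a 256-byte buffer and discards
 * the newline that ends it.
 * Returns 1 if a name was stored, 0 if the line was empty, EOF at end of input. */
int read_name_line(FILE *fp, char name[256])
{
    /* 255 is a field width: at most 255 characters plus the terminating '\0'. */
    int result = fscanf(fp, "%255[^\n]", name);
    int ch;

    if (result == EOF) {
        return EOF;
    }
    if (result == 0) {
        name[0] = '\0';   /* empty line: the scanset matched nothing */
    }
    /* Discard whatever is left of the line, including the newline itself. */
    while ((ch = getc(fp)) != '\n' && ch != EOF) {
        /* nothing to do */
    }
    return result;
}
```

Consuming the newline in a separate step means an empty line is reported as 0 rather than leaving the stream stuck on the same newline.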

Finally, you should also specify a maximum length to avoid buffer overruns. You can do this by placing the buffer length in between the % and the [. For example, if ap->name is a buffer of 256 characters, you should do this:

result = fscanf(fp, "%255[^\n]%*c", ap->name);

This works great for statically allocated arrays; unfortunately, if the array is dynamically sized at runtime, there's no easy way to pass the buffer size to fscanf. You'll have to create the format string with snprintf, e.g.:

char format[256];
snprintf(format, sizeof(format), "%%%d[^\n]%%*c", buffer_size - 1);
result = fscanf(fp, format, ap->name);
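
The same idea can be wrapped in a function. This is only a sketch: read_line_bounded is a hypothetical name, and it assumes a C99 library so that %zu can print a size_t. It also rejects buffers smaller than 2 bytes, since a field width of 0 is not valid in a scanf format.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post).
 * Reads up to buffer_size - 1 characters of the current line into buffer,
 * then discards one following character (normally the newline).
 * Returns what fscanf returns, or 0 if no usable format could be built. */
int read_line_bounded(FILE *fp, char *buffer, size_t buffer_size)
{
    char format[32];
    int n;

    if (buffer_size < 2) {
        return 0;   /* no room for even one character plus '\0' */
    }
    /* Builds e.g. "%63[^\n]%*c" for a 64-byte buffer. */
    n = snprintf(format, sizeof format, "%%%zu[^\n]%%*c", buffer_size - 1);
    if (n < 0 || (size_t)n >= sizeof format) {
        return 0;
    }
    return fscanf(fp, format, buffer);
}
```

Note that if a line is longer than the buffer, the %*c discards an ordinary character rather than the newline, so the rest of that line is still waiting in the stream.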
七分※倦醒 2024-08-21 17:15:37

Jumm wrote:

If I use fgets in the place of the fscanf above I get "\n" in the variable.

Which is a far easier problem to solve so go with it:

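char *nlptr ;
/* fgets returns NULL at end of file or on error; strchr needs <string.h> */
if( fgets( ap->name, MAX, fp ) != NULL )
{
    nlptr = strchr( ap->name, '\n' ) ;
    if( nlptr != NULL )
    {
        *nlptr = '\0' ;   /* overwrite the newline with a terminator */
    }
}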
单调的奢华 2024-08-21 17:15:37

I'm not sure how you mean [^\n] is supposed to work. [] is a modifier which says "accept one character except any of the characters inside this block". The ^ inverts the condition. %s with fscanf only reads until it comes across a delimiter. For strings with spaces and newlines in them, use a combination of fgets and sscanf instead, and specify a restriction on the length.
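
A rough sketch of the fgets plus sscanf combination. The record layout here, a short code followed by a name that may contain spaces on the same line, is an assumption made for illustration, as are the name parse_record and the buffer sizes; the question does not show the real file format.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original post).
 * Assumed layout: "<code> <name with spaces>" on a single line.
 * Returns the number of fields stored (2 on success),
 * or EOF when there is no more input. */
int parse_record(FILE *fp, char code[16], char name[256])
{
    char line[512];

    /* fgets reads one whole line, newline included, and never overruns line. */
    if (fgets(line, sizeof line, fp) == NULL) {
        return EOF;
    }
    /* %15s takes the first word; the space skips the separating whitespace;
     * %255[^\n] takes everything up to, but not including, the newline. */
    return sscanf(line, "%15s %255[^\n]", code, name);
}
```

Because the line is read in full before it is parsed, a malformed line cannot leave stray characters in the stream for the next read to trip over.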

白况 2024-08-21 17:15:37

As I gather, you are trying to use a regular expression in the fscanf function. To my knowledge there is no such thing, nor have I seen it anywhere. Enlighten me on this.

The format specifier for reading a string is %s. It could be that you need to do it this way: %s\n, which will pick up the newline.

But for Pete's sake, do not use the standard old gets family of functions, as specified by Clifford's answer above. That is where buffer overflows happen, and it was exploited by an infamous worm of the late 1980s, the Morris Worm, specifically through the fingerd daemon, which used to call gets and caused chaos. Fortunately, that has since been patched. Furthermore, a lot of programmers have been drilled into the mentality of not using the function.

Even Microsoft has adopted a safe version of the gets family of functions, which takes a parameter to indicate the length of the buffer.

EDIT
My bad - I did not realize that Clifford indeed has specified the max length for input...Whoops! Sorry! Clifford's answer is correct! So +1 to Clifford's answer.

Thanks Neil for pointing out my error...

Hope this helps,
Best regards,
Tom.

稚然 2024-08-21 17:15:37

I found the problem.

As Paul Tomblin said, I had an extra new line character in the field above. So using what tommieb75 said I used:

result = fscanf(fp, "%s\n", ap->code);
result = fscanf(fp, "%[^\n]s", ap->name);

And this fixed it!

Thanks for your help.
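
For reference, a sketch of the same fix with field widths added and the stray s dropped. It assumes the layout described above, a code on one line and the name on the next; struct airport, its field sizes, and read_entry are hypothetical, since the question never shows the real type behind ap.

```c
#include <stdio.h>

/* Hypothetical record type; the question does not show the real definition. */
struct airport {
    char code[16];
    char name[256];
};

/* Reads one code line followed by one name line.
 * Returns 1 on success, 0 at end of input or on a malformed entry. */
int read_entry(FILE *fp, struct airport *ap)
{
    /* %s skips leading whitespace, then stops at the newline after the code. */
    if (fscanf(fp, "%15s", ap->code) != 1) {
        return 0;
    }
    /* The leading space skips that leftover newline, so the scanset
     * starts at the first character of the name. */
    if (fscanf(fp, " %255[^\n]", ap->name) != 1) {
        return 0;
    }
    return 1;
}
```

Putting the whitespace at the start of the second format, rather than "\n" at the end of the first, has the same effect here: either way the newline after the code is skipped before %[^\n] runs.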
