Disadvantages of scanf

Posted on 2024-08-25 14:05:15

I want to know the disadvantages of scanf().

On many sites, I have read that using scanf might cause buffer overflows. What is the reason for this? Are there any other drawbacks to scanf?

Comments (9)

黎歌 2024-09-01 14:05:15

Most of the answers so far seem to focus on the string buffer overflow issue. In reality, the format specifiers that can be used with the scanf functions support an explicit field width, which limits the maximum size of the input and prevents buffer overflow. This renders the popular accusations of string-buffer overflow dangers in scanf virtually baseless. Claiming that scanf is somehow analogous to gets in this respect is completely incorrect. There's a major qualitative difference between scanf and gets: scanf does provide the user with string-buffer-overflow-preventing features, while gets doesn't.

One can argue that these scanf features are difficult to use, since the field width has to be embedded in the format string (there's no way to pass it through a variadic argument, as can be done in printf). That is actually true. scanf is indeed rather poorly designed in that regard. But nevertheless any claims that scanf is somehow hopelessly broken with regard to string-buffer-overflow safety are completely bogus and usually made by lazy programmers.
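One common workaround for the embedded field width is to build the format string at run time. The following is only a sketch of that idea: the helper name read_word and its parameters are hypothetical, and sscanf is used instead of scanf so the behaviour is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper: reads one white-space-delimited word from src into
   dst, which has room for cap bytes including the terminator.
   Returns 1 on success, 0 otherwise. */
int read_word(const char *src, char *dst, size_t cap)
{
    char fmt[32];

    if (cap < 2)
        return 0;                 /* no room for one character plus '\0' */

    /* Produces e.g. "%7s" for cap == 8; the width excludes the '\0'. */
    snprintf(fmt, sizeof fmt, "%%%zus", cap - 1);
    return sscanf(src, fmt, dst) == 1;
}
```

With an 8-byte buffer, read_word("abcdefghijkl", w, sizeof w) stores only "abcdefg". The extra snprintf step is exactly the inconvenience described above, but it keeps the width tied to the real buffer size.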

The real problem with scanf has a completely different nature, even though it is also about overflow. When a scanf function is used for converting decimal representations of numbers into values of arithmetic types, it provides no protection from arithmetic overflow. If overflow happens, scanf produces undefined behavior. For this reason, the only proper way to perform the conversion in the C standard library is to use functions from the strto... family.
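As an illustration of the strto... route, here is a minimal sketch of an overflow-checked conversion to int built on strtol. The helper name parse_int is hypothetical; the range handling relies on strtol setting errno to ERANGE when the value does not fit in a long.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts the decimal text at the start of s to an int.
   Returns 1 on success, 0 if there are no digits or the value is out of range. */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)
        return 0;                 /* no digits were consumed */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                 /* overflow is reported, not undefined */
    *out = (int)v;
    return 1;
}
```

Unlike scanf("%d", &i), an out-of-range input here yields a well-defined failure result that the caller can act on.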

So, to summarize the above, the problem with scanf is that it is difficult (albeit possible) to use properly and safely with string buffers. And it is impossible to use safely for arithmetic input. The latter is the real problem. The former is just an inconvenience.

P.S. The above is intended to be about the entire family of scanf functions (including fscanf and sscanf). With scanf specifically, the obvious issue is that the very idea of using a strictly-formatted function for reading potentially interactive input is rather questionable.

柠檬色的秋千 2024-09-01 14:05:15

The problems with scanf are (at a minimum):

  • using %s to get a string from the user, which leads to the possibility that the string may be longer than your buffer, causing overflow.
  • the possibility of a failed scan leaving your file pointer in an indeterminate location.

I very much prefer using fgets to read whole lines in so that you can limit the amount of data read. If you've got a 1K buffer, and you read a line into it with fgets you can tell if the line was too long by the fact there's no terminating newline character (last line of a file without a newline notwithstanding).

Then you can complain to the user, or allocate more space for the rest of the line (continuously if necessary until you have enough space). In either case, there's no risk of buffer overflow.
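The "allocate more space" option could look roughly like the sketch below. The helper name read_line, the initial capacity of 16, and the doubling strategy are all assumptions made for illustration; lines longer than INT_MAX bytes are not handled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp.
   Returns a malloc'd string without the trailing newline (caller frees),
   or NULL at end of file with nothing read or on allocation failure. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';          /* complete line: strip newline */
            return buf;
        }
        if (len + 1 == cap) {             /* buffer full, line continues */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {                       /* nothing read before EOF */
        free(buf);
        return NULL;
    }
    return buf;                           /* last line had no newline */
}
```

Each fgets call is still bounded by the space that is actually left, so the buffer grows without ever being overrun.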

Once you've read the line in, you know that you're positioned at the next line so there's no problem there. You can then sscanf your string to your heart's content without having to save and restore the file pointer for re-reading.

Here's a snippet of code which I frequently use to ensure no buffer overflow when asking the user for information.

It could be easily adjusted to use a file other than standard input if necessary and you could also have it allocate its own buffer (and keep increasing it until it's big enough) before giving that back to the caller (although the caller would then be responsible for freeing it, of course).

#include <stdio.h>
#include <string.h>

#define OK         0
#define NO_INPUT   1
#define TOO_LONG   2
#define SMALL_BUFF 3

static int getLine (char *prmpt, char *buff, size_t sz) {
    int ch, extra;

    // Size zero or one cannot store enough, so don't even
    // try - we need space for at least newline and terminator.

    if (sz < 2)
        return SMALL_BUFF;

    // Output prompt.

    if (prmpt != NULL) {
        printf ("%s", prmpt);
        fflush (stdout);
    }

    // Get line with buffer overrun protection.

    if (fgets (buff, sz, stdin) == NULL)
        return NO_INPUT;

    // Catch possibility of `\0` in the input stream.

    size_t len = strlen(buff);
    if (len < 1)
        return NO_INPUT;

    // If it was too long, there'll be no newline. In that case, we flush
    // to end of line so that excess doesn't affect the next call.

    if (buff[len - 1] != '\n') {
        extra = 0;
        while (((ch = getchar()) != '\n') && (ch != EOF))
            extra = 1;
        return (extra == 1) ? TOO_LONG : OK;
    }

    // Otherwise remove newline and give string back to caller.
    buff[len - 1] = '\0';
    return OK;
}

And, a test driver for it:

// Test program for getLine().

int main (void) {
    int rc;
    char buff[10];

    rc = getLine ("Enter string> ", buff, sizeof(buff));
    if (rc == NO_INPUT) {
        // Extra NL since my system doesn't output that on EOF.
        printf ("\nNo input\n");
        return 1;
    }

    if (rc == TOO_LONG) {
        printf ("Input too long [%s]\n", buff);
        return 1;
    }

    printf ("OK [%s]\n", buff);

    return 0;
}

Finally, a test run to show it in action:

$ printf "\0" | ./tstprg     # Singular NUL in input stream.
Enter string>
No input

$ ./tstprg < /dev/null       # EOF in input stream.
Enter string>
No input

$ ./tstprg                   # A one-character string.
Enter string> a
OK [a]

$ ./tstprg                   # Longer string but still able to fit.
Enter string> hello
OK [hello]

$ ./tstprg                   # Too long for buffer.
Enter string> hello there
Input too long [hello the]

$ ./tstprg                   # Test limit of buffer.
Enter string> 123456789
OK [123456789]

$ ./tstprg                   # Test just over limit.
Enter string> 1234567890
Input too long [123456789]

颜漓半夏 2024-09-01 14:05:15

From the comp.lang.c FAQ: Why does everyone say not to use scanf? What should I use instead?

scanf has a number of problems (see questions 12.17, 12.18a, and 12.19). Also, its %s format has the same problem that gets() has (see question 12.23): it's hard to guarantee that the receiving buffer won't overflow.

More generally, scanf is designed for relatively structured, formatted input (its name is in fact derived from “scan formatted”). If you pay attention, it will tell you whether it succeeded or failed, but it can tell you only approximately where it failed, and not at all how or why. You have very little opportunity to do any error recovery.

Yet interactive user input is the least structured input there is. A well-designed user interface will allow for the possibility of the user typing just about anything—not just letters or punctuation when digits were expected, but also more or fewer characters than were expected, or no characters at all (i.e., just the RETURN key), or premature EOF, or anything. It’s nearly impossible to deal gracefully with all of these potential problems when using scanf; it’s far easier to read entire lines (with fgets or the like), then interpret them, either using sscanf or some other techniques. (Functions like strtol, strtok, and atoi are often useful; see also questions 12.16 and 13.6.) If you do use any scanf variant, be sure to check the return value to make sure that the expected number of items were found. Also, if you use %s, be sure to guard against buffer overflow.

Note, by the way, that criticisms of scanf are not necessarily indictments of fscanf and sscanf. scanf reads from stdin, which is usually an interactive keyboard and is therefore the least constrained, leading to the most problems. When a data file has a known format, on the other hand, it may be appropriate to read it with fscanf. It’s perfectly appropriate to parse strings with sscanf (as long as the return value is checked), because it’s so easy to regain control, restart the scan, discard the input if it didn’t match, etc.
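The "read a whole line, then interpret it" approach recommended here might be sketched as follows. The helper names parse_point and read_point, the "x y" input format, and the 128-byte line buffer are assumptions for illustration; %d is used for brevity and still offers no overflow protection.

```c
#include <stdio.h>

/* Hypothetical helper: parses "x y" from a line that has already been read.
   Returns 1 only if both integers were found. */
int parse_point(const char *line, int *x, int *y)
{
    return sscanf(line, "%d %d", x, y) == 2;
}

/* Hypothetical helper: reads lines from fp until one parses.
   Returns 1 on success, 0 at end of file. */
int read_point(FILE *fp, int *x, int *y)
{
    char line[128];

    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_point(line, x, y))
            return 1;
        /* Bad line: it has already been consumed, so just try the next one. */
    }
    return 0;
}
```

Because the bad line is gone from the stream before it is parsed, recovery is simply "read another line"; nothing is left behind to confuse the next read.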

References: K&R2 Sec. 7.4 p. 159

撩心不撩汉 2024-09-01 14:05:15

It is very hard to get scanf to do the thing you want. Sure, you can, but things like scanf("%s", buf); are as dangerous as gets(buf);, as everyone has said.

As an example, what paxdiablo is doing in his function to read can be done with something like:

scanf("%10[^\n]%*[^\n]", buf);
getchar();

The above will read a line, store the first 10 non-newline characters in buf, and then discard everything till (and including) a newline. So, paxdiablo's function could be written using scanf the following way:

#include <stdio.h>

enum read_status {
    OK,
    NO_INPUT,
    TOO_LONG
};

static int get_line(const char *prompt, char *buf, size_t sz)
{
    char fmt[40];
    int i;
    int nscanned = 0;   /* stays 0 when %*[^\n] matches nothing, i.e. the line fit */

    printf("%s", prompt);
    fflush(stdout);

    /* Build "%<sz-1>[^\n]%*[^\n]%n": store at most sz-1 characters, discard the rest. */
    snprintf(fmt, sizeof fmt, "%%%zu[^\n]%%*[^\n]%%n", sz-1);
    i = scanf(fmt, buf, &nscanned);
    if (i > 0) {
        getchar();      /* consume the newline left in the stream */
        if (nscanned > 0 && (size_t)nscanned >= sz) {
            return TOO_LONG;
        } else {
            return OK;
        }
    } else {
        return NO_INPUT;
    }
}

int main(void)
{
    char buf[10+1];
    int rc;

    while ((rc = get_line("Enter string> ", buf, sizeof buf)) != NO_INPUT) {
        if (rc == TOO_LONG) {
            printf("Input too long: ");
        }
        printf("->%s<-\n", buf);
    }
    return 0;
}

One of the other problems with scanf is its behavior in case of overflow. For example, when reading an int:

int i;
scanf("%d", &i);

The above cannot be used safely if the input overflows an int. Even in the first case, reading a string is much simpler with fgets than with scanf.

痴情换悲伤 2024-09-01 14:05:15

Yes, you are right. There is a major security flaw in the scanf family (scanf, sscanf, fscanf, etc.), especially when reading a string, because by default they don't take the length of the buffer (into which they are reading) into account.

Example:

char buf[3];
sscanf("abcdef","%s",buf);

Clearly the buffer buf can hold at most 3 chars (2 characters plus the terminating null). But sscanf will try to put "abcdef" into it, causing a buffer overflow.
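For comparison, a bounded version of the same call is sketched below. The helper name read_short_word is hypothetical; the point is only that a field width of 2 leaves room for the terminator in a 3-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: stores at most 2 characters of the first word of src
   in dst, which must have room for 3 bytes. Returns 1 on success. */
int read_short_word(const char *src, char dst[3])
{
    /* The width counts stored characters only; sscanf adds the '\0' itself. */
    return sscanf(src, "%2s", dst) == 1;
}
```

Called with "abcdef", this stores "ab" and stops, so buf is not overrun.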

我爱人 2024-09-01 14:05:15

The advantage of scanf is that once you learn how to use the tool, as you should always do in C, it has immensely useful use cases. You can learn how to use scanf and friends by reading and understanding the manual. If you can't get through that manual without serious comprehension issues, this would probably indicate that you don't know C very well.


scanf and friends suffer from unfortunate design choices that make them difficult (and occasionally impossible) to use correctly without reading the documentation, as other answers have shown. This occurs throughout C, unfortunately, so if I were to advise against using scanf then I would probably advise against using C.

One of the biggest disadvantages seems to be purely the reputation it's earned amongst the uninitiated; as with many useful features of C we should be well informed before we use it. The key is to realise that as with the rest of C, it seems succinct and idiomatic, but that can be subtly misleading. This is pervasive in C; it's easy for beginners to write code that they think makes sense and might even work for them initially, but doesn't make sense and can fail catastrophically.

For example, the uninitiated commonly expect that the %s directive would cause a line to be read, and while that might seem intuitive it isn't true: %s stops at the first white-space character. It's more appropriate to describe the field read as a word. Reading the manual is strongly advised for every function.

What would any response to this question be without mentioning its lack of safety and risk of buffer overflows? As we've already covered, C isn't a safe language, and will allow us to cut corners, possibly to apply an optimisation at the expense of correctness or more likely because we're lazy programmers. Thus, when we know the system will never receive a string larger than a fixed number of bytes, we're given the ability to declare an array of that size and forgo bounds checking. I don't really see this as a downfall; it's an option. Again, reading the manual is strongly advised and would reveal this option to us.

Lazy programmers aren't the only ones stung by scanf. It's not uncommon to see people trying to read float or double values using %d, for example. They're usually mistaken in believing that the implementation will perform some kind of conversion behind the scenes, which would make sense because similar conversions happen throughout the rest of the language, but that's not the case here. As I said earlier, scanf and friends (and indeed the rest of C) are deceptive; they seem succinct and idiomatic but they aren't.
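To make the point about conversion specifiers concrete, here is a small sketch with the matching specifier for each type. The helper names parse_float and parse_double are hypothetical; sscanf is used so the result is easy to check.

```c
#include <stdio.h>

/* Hypothetical helpers: each returns 1 if a value was converted. */

/* float needs %f: scanf stores through the pointer, no promotion happens. */
int parse_float(const char *s, float *out)
{
    return sscanf(s, "%f", out) == 1;
}

/* double needs %lf (unlike printf, where %f also prints a double). */
int parse_double(const char *s, double *out)
{
    return sscanf(s, "%lf", out) == 1;
}
```

Passing a float * or double * to %d does not convert anything; the specifier has to match the pointed-to type exactly.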

Inexperienced programmers aren't forced to consider the success of the operation. Suppose the user enters something entirely non-numeric when we've told scanf to read and convert a sequence of decimal digits using %d. The only way we can intercept such erroneous data is to check the return value, and how often do we bother checking the return value?

Much like fgets, when scanf and friends fail to read what they're told to read, the stream will be left in an unusual state:

  • In the case of fgets, if there isn't sufficient space to store a complete line, then the remainder of the line left unread might be erroneously treated as though it's a new line when it isn't.
  • In the case of scanf and friends, if a conversion fails as described above, the erroneous data is left unread on the stream and might be erroneously treated as though it's part of a different field.

It's no easier to use scanf and friends than to use fgets. If we check for success by looking for a '\n' when we're using fgets or by inspecting the return value when we use scanf and friends, and we find that we've read an incomplete line using fgets or failed to read a field using scanf, then we're faced with the same reality: We're likely to discard input (usually up until and including the next newline)! Yuuuuuuck!

Unfortunately, scanf makes it both hard (non-intuitive) and easy (fewest keystrokes) to discard input in this way. Faced with this reality of discarding user input, some have tried scanf("%*[^\n]%*c");, not realising that the %*[^\n] directive fails when the very next character is a newline, so the %*c is never reached and the newline is still left on the stream.

With a slight adaptation, separating the two format directives into two calls, we see some success: scanf("%*[^\n]"); getchar();. Try doing that with so few keystrokes using some other tool ;)
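The two-call form can be wrapped in a small function, sketched below for an arbitrary stream. The helper name discard_line is hypothetical, and fscanf/getc stand in for scanf/getchar so the same code works on files.

```c
#include <stdio.h>

/* Hypothetical helper: discards the rest of the current line on fp,
   including the newline. */
void discard_line(FILE *fp)
{
    /* Skip everything up to the newline. This matches nothing (and fscanf
       returns 0) when the newline is the very next character, which is why
       the newline is consumed by a separate call. */
    if (fscanf(fp, "%*[^\n]") == EOF)
        return;                   /* end of file or read error */
    getc(fp);                     /* consume the newline itself */
}
```

Calling it on an already-empty line removes just that newline, which is the case the single-call "%*[^\n]%*c" form gets wrong.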

゛清羽墨安 2024-09-01 14:05:15

Problems I have with the *scanf() family:

  • Potential for buffer overflow with %s and %[ conversion specifiers. Yes, you can specify a maximum field width, but unlike with printf(), you can't make it an argument in the scanf() call; it must be hardcoded in the conversion specifier.
  • Potential for arithmetic overflow with %d, %i, etc.
  • Limited ability to detect and reject badly formed input. For example, "12w4" is not a valid integer, but scanf("%d", &value); will successfully convert and assign 12 to value, leaving the "w4" stuck in the input stream to foul up a future read. Ideally the entire input string should be rejected, but scanf() doesn't give you an easy mechanism to do that.
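One way to reject input such as "12w4" is to parse a complete line that has already been read and use %n to check that nothing follows the number. This is only a sketch: the helper name parse_whole_int is hypothetical, and because it relies on %d it does not address the arithmetic overflow item above.

```c
#include <stdio.h>

/* Hypothetical helper: accepts s only if it holds exactly one integer,
   optionally surrounded by white space. Note that *out may already have
   been written when 0 is returned. */
int parse_whole_int(const char *s, int *out)
{
    int n = 0;

    /* The space skips trailing white space; %n records how much was consumed. */
    if (sscanf(s, "%d %n", out, &n) != 1)
        return 0;                 /* no integer at the start */
    return s[n] == '\0';          /* reject trailing characters such as "w4" */
}
```

Here parse_whole_int("12w4", &value) reports failure, while "124" and "124\n" are accepted.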

If you know your input is always going to be well-formed with fixed-length strings and numerical values that don't flirt with overflow, then scanf() is a great tool. If you're dealing with interactive input or input that isn't guaranteed to be well-formed, then use something else.

小巷里的女流氓 2024-09-01 14:05:15

Many answers here discuss the potential overflow issues of using scanf("%s", buf), but the latest POSIX specification more-or-less resolves this issue by providing an m assignment-allocation character that can be used in format specifiers for c, s, and [ formats. This will allow scanf to allocate as much memory as necessary with malloc (so it must be freed later with free).

An example of its use:

char *buf;
if (scanf("%ms", &buf) == 1) { // with 'm', scanf expects a pointer to pointer to char.
    // use buf

    free(buf); // only free when the conversion succeeded
}

The disadvantage of this approach is that it is a relatively recent addition to the POSIX specification and is not specified in the C standard at all, so it remains rather unportable for now.

小ぇ时光︴ 2024-09-01 14:05:15

There is one big problem with scanf-like functions - the lack of any type safety. That is, you can code this:

int i;
scanf("%10s", &i);

Hell, even this is "fine":

scanf("%10s", i);

It's worse than printf-like functions, because scanf expects a pointer, so crashes are more likely.

Sure, there are some format-specifier checkers out there, but those are not perfect, and they are not part of the language or the standard library.
