ungetc: number of bytes of pushback

Posted on 2024-12-10 12:55:17

ungetc is only guaranteed to take one byte of pushback. On the other hand, I've tested it on Windows and Linux and it seems to work with two bytes.

Are there any platforms (e.g. any current Unix systems) on which it actually only takes one byte?

Comments (3)

墨小墨 2024-12-17 12:55:17

The C99 standard (and the C89 standard before that) said unequivocally:

One character of pushback is guaranteed. If the ungetc function is called too many
times on the same stream without an intervening read or file positioning operation on that
stream, the operation may fail.

So, to be portable, you should not assume more than one character of pushback.
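As a minimal sketch of code that stays within the guaranteed single character, a one-character lookahead can be built from one getc followed by one ungetc. The name peek_char is hypothetical and is not part of the standard library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the next character on fp without consuming it.
 * It reads one character and pushes that same character back, so it never
 * needs more than the single character of pushback the standard guarantees. */
int peek_char(FILE *fp)
{
    assert(fp != NULL);
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);   /* exactly one pushback after a read */
    return c;
}
```

Each call is preceded by a read on the same stream, so the pushback count never exceeds one no matter how many times peek_char is called.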

Having said that, on both MacOS X 10.7.2 (Lion) and RHEL 5 (Linux, x86/64), I tried:

#include <stdio.h>
int main(void)
{
    int i;
    for (i = 0; i < 4096; i++)
    {
        int c = i % 16 + 64;
        if (ungetc(c, stdin) != c)
        {
            fprintf(stderr, "Error at count = %d\n", i);
            return(1);
        }
    }
    printf("No error up to count = %d\n", i-1);
    return(0);
}

I got no error on either platform. By contrast, on Solaris 10 (SPARC), I got an error at 'count = 4', and the same on Solaris 11.3. Worse, on HP-UX 11.00 (PA-RISC), HP-UX 11.23 (Itanium), and HP-UX 11.31 (Itanium), I got an error at 'count = 1', belying the theory that 2 is safe. Similarly, AIX 6.0 (and 7.2) gave an error at 'count = 1'. Since the loop counts from 0, an error at 'count = N' means that exactly N characters were pushed back successfully.

Summary

  • Linux: at least 4096 (4 KiB; no failure in the test)
  • MacOS X: at least 4096 (4 KiB; no failure in the test)
  • Solaris: 4
  • HP-UX: 1
  • AIX: 1

So, AIX and HP-UX only allow one character of pushback on an input file from which no data has been read yet. This is a nasty case; they might provide much more pushback capacity once some data has been read from the file (but a simple test on AIX, adding a getchar() before the loop, didn't change the pushback capacity).

In December 2023, the program above failed at count = 1 on Windows Server 2016 Standard using MSVC 19.15.26730 for x64. This is different from what rwallace found.
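The test program above is tied to stdin. A possible variant, sketched here as a function so the same probe can be run against any stream, is shown below. The name pushback_capacity and the limit parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count how many consecutive ungetc() calls succeed
 * on fp, trying at most `limit` times. The pushed-back characters are left
 * on the stream; the caller can discard them with a file positioning call
 * such as rewind() on a seekable stream. */
int pushback_capacity(FILE *fp, int limit)
{
    assert(fp != NULL && limit >= 0);
    int count = 0;
    while (count < limit) {
        int c = 'A' + count % 26;   /* any valid character works here */
        if (ungetc(c, fp) != c)
            break;                  /* this implementation's limit was hit */
        count++;
    }
    return count;
}
```

The result describes only the implementation and stream state it was run on, so it is useful for surveying platforms and not as something a portable program should rely on.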

木緿 2024-12-17 12:55:17

Implementations which support 2 characters of pushback probably do so in order that scanf can use ungetc for its pushback rather than requiring a second nearly-identical mechanism. What this means for you as the application programmer is that even if calling ungetc twice seems to work, it might not be reliable in all situations -- for example, if the last operation on the stream was fscanf and it had to use pushback, you can probably only ungetc one character.

In any case, it's nonportable to rely on having more than one character of ungetc pushback, so I would highly advise against writing code that needs it...
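If a parser really does need several characters of lookahead, one portable option is to keep the pushback in the application and not use ungetc for it at all. The sketch below is one way to do that, not something proposed in this thread: struct pb_stream, pb_init, pb_getc, pb_ungetc and the capacity PB_MAX are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define PB_MAX 16   /* arbitrary capacity chosen for this sketch */

/* Hypothetical wrapper: a FILE* plus an application-level pushback stack. */
struct pb_stream {
    FILE *fp;
    int stack[PB_MAX];
    int depth;
};

void pb_init(struct pb_stream *s, FILE *fp)
{
    assert(s != NULL && fp != NULL);
    s->fp = fp;
    s->depth = 0;
}

/* Return the most recently pushed-back character, or read from the file. */
int pb_getc(struct pb_stream *s)
{
    if (s->depth > 0)
        return s->stack[--s->depth];
    return getc(s->fp);
}

/* Push c back; returns c on success, EOF if c is EOF or the stack is full. */
int pb_ungetc(int c, struct pb_stream *s)
{
    if (c == EOF || s->depth == PB_MAX)
        return EOF;
    s->stack[s->depth++] = (unsigned char)c;
    return (unsigned char)c;
}
```

All reads must go through pb_getc for this to work: a direct getc, fscanf or fgets on the underlying FILE* would bypass the stack and return characters out of order.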

空城缀染半城烟沙 2024-12-17 12:55:17

There are some posts here suggesting that it makes sense to support 2 chars for the sake of scanf.

I don't think this is right: scanf only needs one, and this is indeed the reason for the limit. The original implementation (back in the mid 70s) supported 100, and the manual had a note: in the future we may decide to support only 1, since that's all that scanf needs. See page 3 of the original manual (Maybe not original, but pretty old.)

To see more vividly that scanf needs only 1 char, consider this code for the %u feature of scanf.

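#include <ctype.h>
#include <stdio.h>

/* Sketch of scanf's %u on stdin; overflow and "no digits" cases are ignored. */
unsigned read_unsigned(void)
{
    int c;
    while (isspace(c = getchar())) {}  /* skip white space */
    unsigned num = 0;
    for (; isdigit(c); c = getchar())
        num = num * 10 + (unsigned)(c - '0');
    ungetc(c, stdin);  /* single pushback; a no-op if c is EOF */
    return num;
}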

Only a single call to ungetc() is needed here. There is no reason why scanf needs a char all to itself: it can share with the user.
