比 scanf 快吗？

发布于 2024-12-20 19:09:25 字数 533 浏览 0 评论 0原文

我正在使用 scanf("%d", &someint) 对正整数进行大量解析。因为我想看看 scanf 是否是瓶颈，所以我使用 fread 实现了一个简单的整数解析函数，就像：

int result;
char c;

while (fread(&c, sizeof c, 1, stdin), c == ' ' || c == '\n')
    ;

result = c - '0';
while (fread(&c, sizeof c, 1, stdin), c >= '0' || c <= '9') {
     result *= 10;
     result += c - '0';
}

return result;

但令我惊讶的是，这个函数的性能（即使使用内联）约为 50%更糟。难道不应该有可能针对特殊情况改进 scanf 吗？ fread 不是应该很快吗（附加提示：整数（编辑：主要）是 1 或 2 位数字）？

原文

I was doing massive parsing of positive integers using scanf("%d", &someint). As I wanted to see if scanf was a bottleneck, I implemented a naive integer parsing function using fread, just like:

int result;
char c;

while (fread(&c, sizeof c, 1, stdin), c == ' ' || c == '\n')
    ;

result = c - '0';
while (fread(&c, sizeof c, 1, stdin), c >= '0' || c <= '9') {
     result *= 10;
     result += c - '0';
}

return result;

But to my astonishment, the performance of this function is (even with inlining) about 50% worse. Shouldn't there be a possibility to improve on scanf for specialized cases? Isn't fread supposed to be fast (additional hint: Integers are (edit: mostly) 1 or 2 digits)?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

寂寞花火° 2024-12-27 19:09:25

我遇到的开销不是解析本身，而是对 fread 的多次调用（与 fgetc 和朋友相同）。对于每次调用，libc 都必须锁定输入流，以确保两个线程不会互相干扰。锁定是一个非常昂贵的操作。

我们正在寻找的是一个函数，它可以为我们提供缓冲输入（重新实现缓冲实在是太费力了），但又避免了 fgetc 的巨大锁定开销。

如果我们可以保证只有一个线程使用输入流，我们可以使用unlocked_stdio(3)中的函数，例如getchar_unlocked(3)。这是一个示例：

static int parseint(void)
{
    int c, n;

    n = getchar_unlocked() - '0';
    while (isdigit((c = getchar_unlocked())))
        n = 10*n + c-'0';

    return n;
}

上面的版本不检查错误。但它肯定会终止。如果需要错误处理，那么在最后检查 feof(stdin) 和 ferror(stdin) 可能就足够了，或者让调用者来做。

我最初的目的是提交 SPOJ 编程问题的解决方案，其中输入只有空格和数字。所以还有改进的空间，即isdigit检查。

static int parseint(void)
{
    int c, n;

    n = getchar_unlocked() - '0';
    while ((c = getchar_unlocked()) >= '0')
        n = 10*n + c-'0';

    return n;
}

无论是在性能方面还是在便利性和可维护性方面，都非常非常难以击败这个解析例程。

The overhead I was encountering was not the parsing itself but the many calls to fread (same with fgetc, and friends). For each call, the libc has to lock the input stream to make sure two threads aren't stepping on each other's feet. Locking is a very costly operation.

What we're looking for is a function that gives us buffered input (reimplementing buffering is just too much effort) but avoids the huge locking overhead of fgetc.

If we can guarantee that there is only a single thread using the input stream, we can use the functions from unlocked_stdio(3), such as getchar_unlocked(3). Here is an example:

static int parseint(void)
{
    int c, n;

    n = getchar_unlocked() - '0';
    while (isdigit((c = getchar_unlocked())))
        n = 10*n + c-'0';

    return n;
}

The above version doesn't check for errors. But it's guaranteed to terminate. If error handling is needed it might be enough to check feof(stdin) and ferror(stdin) at the end, or let the caller do it.

My original purpose was submitting solutions to programming problems at SPOJ, where the input is only whitespace and digits. So there is still room for improvement, namely the isdigit check.

static int parseint(void)
{
    int c, n;

    n = getchar_unlocked() - '0';
    while ((c = getchar_unlocked()) >= '0')
        n = 10*n + c-'0';

    return n;
}

It's very, very hard to beat this parsing routine, both performance-wise and in terms of convenience and maintainability.

回复收藏 0 原文

放飞的风筝 2024-12-27 19:09:25

您将能够通过缓冲显着改进您的示例 - 将大量字符读入内存，然后从内存版本中解析它们。

如果您从磁盘读取数据，则缓冲区大小为块大小的倍数可能会提高性能。

编辑：您可以使用让内核为您处理此问题mmap 将文件映射到内存中。

回复收藏 0 原文

打小就很酷 2024-12-27 19:09:25

这是我用的东西。

 #define scan(x) do{while((x=getchar())<'0'); for(x-='0'; '0'<=(_=getchar()); x=(x<<3)+(x<<1)+_-'0');}while(0)
 char _;

但是，这只适用于整数。

Here's something I use.

 #define scan(x) do{while((x=getchar())<'0'); for(x-='0'; '0'<=(_=getchar()); x=(x<<3)+(x<<1)+_-'0');}while(0)
 char _;

However, this only works with Integers.

回复收藏 0 原文

脱离于你 2024-12-27 19:09:25

根据你所说的，我得出以下事实：

数字在 0-99 范围内，它占 10+100 个不同的字符串（包括前导零），
你相信你的输入流遵循某种规范并且不会包含任何意外的字符序列

在这种情况下，我将使用查找表将字符串转换为数字。给定一个字符串 s[2]，查找表的索引可以通过 s[1]*10 + s[0] 计算，交换数字并利用 '\0' 等于 ASCII 中的 0。

然后，您可以通过以下方式读取您的输入：

// given our lookup method, this table may need padding entries
int lookup_table[] = { /*...*/ };

// no need to call superfluous functions
#define str2int(x) (lookup_table[(x)[1]*10 + (x)[0]])

while(read_token_from_stream(stdin, buf))
        next_int = str2int(buf);

在当今的机器上，很难想出更快的技术。我的猜测是，此方法的运行速度可能比任何基于 scanf() 的方法快 2 到 10 倍。

From what you say, I derive the following facts:

numbers are in the range of 0-99, which accounts for 10+100 different strings (including leading zeros)
you trust that your input stream adheres to some sort of specification and won't contain any unexpected character sequences

In that case, I'd use a lookup table to convert strings to numbers. Given a string s[2], the index to your lookup table can be computed by s[1]*10 + s[0], swapping the digits and making use of the fact that '\0' equals 0 in ASCII.

Then, you can read your input in the following way:

// given our lookup method, this table may need padding entries
int lookup_table[] = { /*...*/ };

// no need to call superfluous functions
#define str2int(x) (lookup_table[(x)[1]*10 + (x)[0]])

while(read_token_from_stream(stdin, buf))
        next_int = str2int(buf);

On today's machines, it will be hard to come up with a faster technique. My guess is that this method will likely run 2 to 10 times faster than any scanf()-based approach.

回复收藏 0 原文

~没有更多了~