比 scanf 快吗?
我正在使用 scanf("%d", &someint)
对正整数进行大量解析。因为我想看看 scanf 是否是瓶颈,所以我使用 fread
实现了一个简单的整数解析函数,就像:
int result;
char c;
while (fread(&c, sizeof c, 1, stdin), c == ' ' || c == '\n')
;
result = c - '0';
while (fread(&c, sizeof c, 1, stdin), c >= '0' || c <= '9') {
result *= 10;
result += c - '0';
}
return result;
但令我惊讶的是,这个函数的性能(即使使用内联)约为 50%更糟。难道不应该有可能针对特殊情况改进 scanf 吗? fread
不是应该很快吗(附加提示:整数(编辑:主要)是 1 或 2 位数字)?
I was doing massive parsing of positive integers using scanf("%d", &someint)
. As I wanted to see if scanf was a bottleneck, I implemented a naive integer parsing function using fread
, just like:
int result;
char c;
while (fread(&c, sizeof c, 1, stdin), c == ' ' || c == '\n')
;
result = c - '0';
while (fread(&c, sizeof c, 1, stdin), c >= '0' || c <= '9') {
result *= 10;
result += c - '0';
}
return result;
But to my astonishment, the performance of this function is (even with inlining) about 50% worse. Shouldn't there be a possibility to improve on scanf for specialized cases? Isn't fread
supposed to be fast (additional hint: Integers are (edit: mostly) 1 or 2 digits)?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我遇到的开销不是解析本身,而是对
fread
的多次调用(与fgetc
和朋友相同)。对于每次调用,libc 都必须锁定输入流,以确保两个线程不会互相干扰。锁定是一个非常昂贵的操作。我们正在寻找的是一个函数,它可以为我们提供缓冲输入(重新实现缓冲实在是太费力了),但又避免了 fgetc 的巨大锁定开销。
如果我们可以保证只有一个线程使用输入流,我们可以使用unlocked_stdio(3)中的函数,例如getchar_unlocked(3)。这是一个示例:
上面的版本不检查错误。但它肯定会终止。如果需要错误处理,那么在最后检查
feof(stdin)
和ferror(stdin)
可能就足够了,或者让调用者来做。我最初的目的是提交 SPOJ 编程问题的解决方案,其中输入只有空格和数字。所以还有改进的空间,即
isdigit
检查。无论是在性能方面还是在便利性和可维护性方面,都非常非常难以击败这个解析例程。
The overhead I was encountering was not the parsing itself but the many calls to
fread
(same withfgetc
, and friends). For each call, the libc has to lock the input stream to make sure two threads aren't stepping on each other's feet. Locking is a very costly operation.What we're looking for is a function that gives us buffered input (reimplementing buffering is just too much effort) but avoids the huge locking overhead of
fgetc
.If we can guarantee that there is only a single thread using the input stream, we can use the functions from
unlocked_stdio(3)
, such asgetchar_unlocked(3)
. Here is an example:The above version doesn't check for errors. But it's guaranteed to terminate. If error handling is needed it might be enough to check
feof(stdin)
andferror(stdin)
at the end, or let the caller do it.My original purpose was submitting solutions to programming problems at SPOJ, where the input is only whitespace and digits. So there is still room for improvement, namely the
isdigit
check.It's very, very hard to beat this parsing routine, both performance-wise and in terms of convenience and maintainability.
您将能够通过缓冲显着改进您的示例 - 将大量字符读入内存,然后从内存版本中解析它们。
如果您从磁盘读取数据,则缓冲区大小为块大小的倍数可能会提高性能。
编辑:您可以使用 让内核为您处理此问题mmap 将文件映射到内存中。
You'll be able to improve significantly on your example by buffering - read a large number of characters into memory, and then parse them from the in-memory version.
If you're reading from disk you might get a performance increase by your buffer being a multiple of the block size.
Edit: You can let the kernel handle this for you by using mmap to map the file into memory.
这是我用的东西。
但是,这只适用于整数。
Here's something I use.
However, this only works with Integers.
根据你所说的,我得出以下事实:
在这种情况下,我将使用查找表将字符串转换为数字。给定一个字符串 s[2],查找表的索引可以通过
s[1]*10 + s[0]
计算,交换数字并利用'\0'
等于 ASCII 中的0
。然后,您可以通过以下方式读取您的输入:
在当今的机器上,很难想出更快的技术。我的猜测是,此方法的运行速度可能比任何基于
scanf()
的方法快 2 到 10 倍。From what you say, I derive the following facts:
In that case, I'd use a lookup table to convert strings to numbers. Given a string s[2], the index to your lookup table can be computed by
s[1]*10 + s[0]
, swapping the digits and making use of the fact that'\0'
equals0
in ASCII.Then, you can read your input in the following way:
On today's machines, it will be hard to come up with a faster technique. My guess is that this method will likely run 2 to 10 times faster than any
scanf()
-based approach.