高效输出数字
我想将用空格分隔的积分列表打印到标准输出。列表生成速度很快,所以我尝试用序列 [1..200000] 来解决这个问题。
在 C 中,我可以这样实现它:
#include "stdio.h"
int main()
{
int i;
for(i = 0; i <= 200000; ++i)
printf("%d ", i);
return 0;
}
我可以实现的 Haskell 中最快的解决方案大约慢三倍:
import Data.List (intercalate)
main = putStr . intercalate " " . map (show) $ [1..(200000)]
我以某些方式尝试了 ByteStrings,但使用它们它变得更慢。 大问题似乎是使用 show 将整数转换为字符串(或转换为 ByteStrings)。
有什么建议如何在不与 C 接口的情况下加快速度吗?它不应该变得太复杂(尽可能简短和美观,使用其他 Haskell 模块就可以了)。
I want to print a list of integrals separated with spaces to stdout. The list generation is fast, so I tried to solve this problem with the sequence [1..200000].
In C, I can implement it like this:
#include "stdio.h"
int main()
{
int i;
for(i = 0; i <= 200000; ++i)
printf("%d ", i);
return 0;
}
The fastest solution in Haskell I could implement is about three times slower:
import Data.List (intercalate)
main = putStr . intercalate " " . map (show) $ [1..(200000)]
I tried ByteStrings in some ways, but with them it got even slower.
The big problems seems to be the conversion of the integers to strings with show (or the conversion to ByteStrings).
Any suggestions how to speed this up without interfacing to C? It should not become to complicated (as short and beautiful as possible, using other Haskell modules is fine).
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
好吧,您可以稍微重写一下代码:
然后您可以使用“ghc -O2 -prof -auto-all .hs”对其进行分析:
您可以看到插值占用了一半的运行时间。不过,我认为如果不诉诸一些低级的技巧,你无法让整个事情进展得更快。如果您切换到更快的插入(例如,从 Data.ByteString.Lazy.Char8),您将不得不使用 Int -> 的较慢变体。字符串转换。
Well, you could rewrite the code a bit:
Then you could profile it using "ghc -O2 -prof -auto-all .hs":
You can see that intercalate takes a good half of the runtime. I don't think that you could make the whole thing go faster, though, without resorting to some low-level trickery. If you switch to faster intercalate (from Data.ByteString.Lazy.Char8, for example), you would have to use a slower variant of Int -> String conversion.
如果我使用 ghc-6.10.4 而不是 ghc-6.12.1,这个程序运行得更快。 IIRC 6.12 系列引入了 unicode-aware IO,我认为这是导致速度下降的主要原因。
我的系统:
使用 ghc-6.10 时,结果与 C 相当;我认为其中的差异是由于 Haskell 对字符串的使用(也可能是运行时开销)。
我认为如果你想从该编译器获得更好的性能,可以绕过 ghc-6.12 的 I/O 中的 unicode 转换。
This program runs much faster if I use ghc-6.10.4 instead of ghc-6.12.1. IIRC the 6.12 line introduced unicode-aware IO, which I think accounts for a lot of the slowdown.
My system:
When using ghc-6.10 the result is pretty comparable to C; I think the difference there is due to Haskell's use of strings (and probably runtime overhead too).
I think it's possible to bypass the unicode conversion in ghc-6.12's I/O if you want to get better performance from that compiler.
第一个问题:
发布一些代码!
我猜(根据 delnan :),它很慢,因为发生了以下情况(如果不使用字节串,请跳过步骤 4):
pack
)使用字节串可能会更快,但您可能应该实现自己的
show
,它适用于字节串。然后,要聪明一点,避免多次遍历,在创建列表后输入空格。也许是这样的:
这是未经测试的,所以简介!你看, to 映射可以融合,所以列表可能会被遍历两次。
First question:
Post some code!!!
I guess (according to delnan :), that it's slow because the following happens (skip step 4if you don't use bytestring):
pack
)It could be faster with bytestring, but you should probably implement your own
show
, which works with bytestrings. Then, be so smart and avoid multiple traversion, input the whitespaces once the list is created.Maybe like this:
This is untested, so profile!!! You see, the to maps can be fused, so the list will be traversed maybe two times.
这是解决同一问题的不同方法,尝试利用字符串后缀中的共享。它在我的机器上比原来的 Haskell 快了大约 1/3,尽管不可否认,它与 C 版本还有很大差距。做 1 到 999999 以外的数字作为练习:
Here is a different approach to the same problem, that attempts to exploit the sharing in the string suffixes. It went about 1/3rd faster on my machine than the original Haskell, although admittedly still a way off the C version. Doing numbers other than 1 to 999999 is left as an exercise:
这个版本比你的好一点。我想这是改进它的一种方法。
This version does a bit better then yours. I guess it's one way to improve on it.