高效输出数字

发布于 2024-10-04 09:20:43 字数 542 浏览 2 评论 0原文

我想将用空格分隔的积分列表打印到标准输出。列表生成速度很快，所以我尝试用序列 [1..200000] 来解决这个问题。

在 C 中，我可以这样实现它：

#include "stdio.h"
int main()
{
  int i;
  for(i = 0; i <= 200000; ++i)
    printf("%d ", i);
  return 0;
}

我可以实现的 Haskell 中最快的解决方案大约慢三倍：

import Data.List (intercalate)
main = putStr . intercalate " " . map (show) $ [1..(200000)]

我以某些方式尝试了 ByteStrings，但使用它们它变得更慢。大问题似乎是使用 show 将整数转换为字符串（或转换为 ByteStrings）。

有什么建议如何在不与 C 接口的情况下加快速度吗？它不应该变得太复杂（尽可能简短和美观，使用其他 Haskell 模块就可以了）。

原文

I want to print a list of integrals separated with spaces to stdout. The list generation is fast, so I tried to solve this problem with the sequence [1..200000].

In C, I can implement it like this:

#include "stdio.h"
int main()
{
  int i;
  for(i = 0; i <= 200000; ++i)
    printf("%d ", i);
  return 0;
}

The fastest solution in Haskell I could implement is about three times slower:

import Data.List (intercalate)
main = putStr . intercalate " " . map (show) $ [1..(200000)]

I tried ByteStrings in some ways, but with them it got even slower.
The big problems seems to be the conversion of the integers to strings with show (or the conversion to ByteStrings).

Any suggestions how to speed this up without interfacing to C? It should not become to complicated (as short and beautiful as possible, using other Haskell modules is fine).

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

满身野味 2024-10-11 09:20:43

好吧，您可以稍微重写一下代码：

import Data.List (intercalate)
main = output
output = putStr one_string
one_string = intercalate " " strings
strings = map show $ [1..2000000]

然后您可以使用“ghc -O2 -prof -auto-all .hs”对其进行分析：

COST CENTRE                    MODULE               %time %alloc

one_string                     Main                  42.2   55.9
strings                        Main                  39.2   43.1
output                         Main                  18.6    1.0

您可以看到插值占用了一半的运行时间。不过，我认为如果不诉诸一些低级的技巧，你无法让整个事情进展得更快。如果您切换到更快的插入（例如，从 Data.ByteString.Lazy.Char8），您将不得不使用 Int -> 的较慢变体。字符串转换。

Well, you could rewrite the code a bit:

import Data.List (intercalate)
main = output
output = putStr one_string
one_string = intercalate " " strings
strings = map show $ [1..2000000]

Then you could profile it using "ghc -O2 -prof -auto-all .hs":

COST CENTRE                    MODULE               %time %alloc

one_string                     Main                  42.2   55.9
strings                        Main                  39.2   43.1
output                         Main                  18.6    1.0

You can see that intercalate takes a good half of the runtime. I don't think that you could make the whole thing go faster, though, without resorting to some low-level trickery. If you switch to faster intercalate (from Data.ByteString.Lazy.Char8, for example), you would have to use a slower variant of Int -> String conversion.

回复收藏 0 原文

在你怀里撒娇 2024-10-11 09:20:43

如果我使用 ghc-6.10.4 而不是 ghc-6.12.1，这个程序运行得更快。 IIRC 6.12 系列引入了 unicode-aware IO，我认为这是导致速度下降的主要原因。

我的系统：

C  (gcc -O2):        0.141s
HS (ghc-6.10.4 -O2): 0.191s (ave.)
HS (ghc-6.12.1 -O2): 0.303 (ave.)

使用 ghc-6.10 时，结果与 C 相当；我认为其中的差异是由于 Haskell 对字符串的使用（也可能是运行时开销）。

我认为如果你想从该编译器获得更好的性能，可以绕过 ghc-6.12 的 I/O 中的 unicode 转换。

This program runs much faster if I use ghc-6.10.4 instead of ghc-6.12.1. IIRC the 6.12 line introduced unicode-aware IO, which I think accounts for a lot of the slowdown.

My system:

C  (gcc -O2):        0.141s
HS (ghc-6.10.4 -O2): 0.191s (ave.)
HS (ghc-6.12.1 -O2): 0.303 (ave.)

When using ghc-6.10 the result is pretty comparable to C; I think the difference there is due to Haskell's use of strings (and probably runtime overhead too).

I think it's possible to bypass the unicode conversion in ghc-6.12's I/O if you want to get better performance from that compiler.

回复收藏 0 原文

叹沉浮 2024-10-11 09:20:43

第一个问题：

发布一些代码！

我猜（根据 delnan :)，它很慢，因为发生了以下情况（如果不使用字节串，请跳过步骤 4）：

所有数字都一一转换。输出是一个列表。
必须再次遍历输出列表，因为您添加了元素（空格！）。
必须再次遍历列表，因为您连接了它。
必须遍历列表。 em>再次，因为它被转换为字节串（pack）
整个内容被打印出来。

使用字节串可能会更快，但您可能应该实现自己的 show，它适用于字节串。然后，要聪明一点，避免多次遍历，在创建列表后输入空格。

也许是这样的：

import qualified Data.Bytestring.Lazy.Char8 as B

showB :: Int -> Bytestring -- Left as an exercise to the reader

main = B.putStr $ pipeline [0..20000] where
  pipeline = B.tail . B.concat . map (B.cons' ' ') . map showB

这是未经测试的，所以简介！你看， to 映射可以融合，所以列表可能会被遍历两次。

First question:

Post some code!!!

I guess (according to delnan :), that it's slow because the following happens (skip step 4if you don't use bytestring):

All the numbers are one by one converted. The output is a list.
The list of outputs have to be traversed again because you add elements (the spaces!)
The list have to be traversed again because you concat it
The list has to be traversed again because it is converted to bytestring (pack)
The whole thing is printed out.

It could be faster with bytestring, but you should probably implement your own show, which works with bytestrings. Then, be so smart and avoid multiple traversion, input the whitespaces once the list is created.

Maybe like this:

import qualified Data.Bytestring.Lazy.Char8 as B

showB :: Int -> Bytestring -- Left as an exercise to the reader

main = B.putStr $ pipeline [0..20000] where
  pipeline = B.tail . B.concat . map (B.cons' ' ') . map showB

This is untested, so profile!!! You see, the to maps can be fused, so the list will be traversed maybe two times.

回复收藏 0 原文

莫言歌 2024-10-11 09:20:43

这是解决同一问题的不同方法，尝试利用字符串后缀中的共享。它在我的机器上比原来的 Haskell 快了大约 1/3，尽管不可否认，它与 C 版本还有很大差距。做 1 到 999999 以外的数字作为练习：

basic :: [Char]
basic = ['0'..'9']

strip :: String -> String
strip = (' ' :) . dropWhile (== '0')

numbers :: Int -> [String]
numbers 0 = [""]
numbers n = [x : xs | x <- basic, xs <- rest]
  where
    rest = numbers (n - 1)

main = mapM_ (putStr . strip) (tail $ numbers 6)

Here is a different approach to the same problem, that attempts to exploit the sharing in the string suffixes. It went about 1/3rd faster on my machine than the original Haskell, although admittedly still a way off the C version. Doing numbers other than 1 to 999999 is left as an exercise:

basic :: [Char]
basic = ['0'..'9']

strip :: String -> String
strip = (' ' :) . dropWhile (== '0')

numbers :: Int -> [String]
numbers 0 = [""]
numbers n = [x : xs | x <- basic, xs <- rest]
  where
    rest = numbers (n - 1)

main = mapM_ (putStr . strip) (tail $ numbers 6)

回复收藏 0 原文

蓝天 2024-10-11 09:20:43

这个版本比你的好一点。我想这是改进它的一种方法。

showWithSpaces        :: (Show a) => [a] -> ShowS
showWithSpaces []     = showString ""
showWithSpaces (x:xs) = shows x . showChar ' ' . showWithSpaces xs

main = putStrLn $ showWithSpaces [1..2000000] $ ""

This version does a bit better then yours. I guess it's one way to improve on it.

showWithSpaces        :: (Show a) => [a] -> ShowS
showWithSpaces []     = showString ""
showWithSpaces (x:xs) = shows x . showChar ' ' . showWithSpaces xs

main = putStrLn $ showWithSpaces [1..2000000] $ ""

回复收藏 0 原文

~没有更多了~