Haskell 中无限列表的编译器优化?
我有各种 t -> 类型的“部分排列”函数也许 t
要么通过返回 Just
将我带到数据结构中的新位置,要么返回 Nothing
(如果它们还无法到达那里)。
我通常必须以重复的特定模式应用这些部分排列,构建所有中间值的列表,但每当我返回到起始位置或排列失败时,就会截断该列表。
scan_partial_perms :: Eq t => [t -> Maybe t] -> t -> [t]
scan_partial_perms ps v = map fromJust . takeWhile test $ scanl (>>=) (Just v) ps
where test (Just i) | i /= v = True
test _ = False
iterate_partial_perm = scan_partial_perm . iterate
cycle_partial_perms = scan_partial_perms perms . cycle
我相当有信心 scanl
在这种情况下具有理想的严格性和尾递归属性。关于优化此代码还有其他技巧吗?特别是,除了 -O3 -fllvm
之外,我还应该阅读哪些编译器选项?
在最坏的情况下,我可以用定义的访问器函数替换 scanl 和无限列表,就像
perm l i = l !! i `rem` length l
我想象的那样,但这不能通过正确的优化来提高性能。
I've various "partial permutation" functions of type t -> Maybe t
that either take me to a new location in a data structure by returning a Just
or else return a Nothing
if they cannot yet get there.
I routinely must applying these partial permutations in repeated specific patterns, building a list of all intermediate values, but truncating the list whenever I return to my starting position or a permutation fails.
scan_partial_perms :: Eq t => [t -> Maybe t] -> t -> [t]
scan_partial_perms ps v = map fromJust . takeWhile test $ scanl (>>=) (Just v) ps
where test (Just i) | i /= v = True
test _ = False
iterate_partial_perm = scan_partial_perm . iterate
cycle_partial_perms = scan_partial_perms perms . cycle
I'm fairly confident that scanl
has the desirable strictness and tail recursion properties in this context. Any other tips on optimizing this code? In particular, what compiler options beyond -O3 -fllvm
should I read about?
At worst, I could replace the scanl
and infinite list with an accessor function defined like
perm l i = l !! i `rem` length l
I'd imagine this cannot improve performance with the right optimizations however.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我认为你在
scan_partial_perms
中有一个错误,scanl fs list
总是以s
开头,所以takeWhile test (scanl ...)
是[]
。如果这是故意的,那就太混乱了。假设你想要什么,你无能为力。您可以
{-# SPECIALIZE #-}
它,以便消除专门用于类型的Eq
字典。如果编译器不自行执行此操作(如果它可以看到使用站点,则可能会执行此操作),这会对您有一些好处。使用 ghc >= 7,您可以将其设置为{-# INLINABLE #-}
,这样它就可以专门化,并且可能在每个使用站点上内联。我不知道 llvm 道路上会发生什么,但在核心级别,
map
、fromJust
和takeWhile
尚未内联,因此,如果你足够绝望,如果稍后没有在 llvm 后端内联它们,你可以通过手动内联它们来获得百分之零点几:但这些都是非常便宜的函数,所以收益 - 如果有的话 -会很小。
所以你所拥有的已经相当不错了,如果还不够好,你就需要不同的攻击途径。
带有列表索引的那个
看起来不太好。
length
很昂贵,(!!)
也很昂贵,因此通常应该避免两者。I think you have a bug in
scan_partial_perms
,scanl f s list
always starts withs
, sotakeWhile test (scanl ...)
is[]
. If that is intentional, it's quite obfuscated. Assuming what you want isthere's not much you can do. You can
{-# SPECIALISE #-}
it so theEq
dictionary is eliminated for the specialised-for types. That'll do you some good if the compiler doesn't do that on its own (which it may if it can see the use site). With ghc >= 7, you can instead make it{-# INLINABLE #-}
, so that it can be specialised and perhaps inlined at each use site.I don't know what happens down the llvm road, but at the core-level,
map
,fromJust
andtakeWhile
are not yet inlined, so if you're desperate enough, you can get maybe a few tenths of a percent by inlining them manually if they aren't inlined later in the llvm backend:But those are very cheap functions, so the gains - if at all present - would be small.
So what you have is rather good already, if it's not good enough, you need a different route of attack.
The one with the list indexing,
doesn't look good.
length
is expensive,(!!)
is expensive too, so both should in general be avoided.