如何实现惰性常量空间三分区函数?

发布于 2025-01-01 11:24:34 字数 1621 浏览 3 评论 0原文

我已将现有的 Data.List.partition 实现概括

partition :: (a -> Bool) -> [a] -> ([a],[a])
partition p xs = foldr (select p) ([],[]) xs
  where
    -- select :: (a -> Bool) -> a -> ([a], [a]) -> ([a], [a])
    select p x ~(ts,fs) | p x       = (x:ts,fs)
                        | otherwise = (ts, x:fs)

为“三分区”函数

ordPartition :: (a -> Ordering) -> [a] -> ([a],[a],[a])
ordPartition cmp xs = foldr select ([],[],[]) xs
  where
    -- select :: a -> ([a], [a], [a]) -> ([a], [a], [a])
    select x ~(lts,eqs,gts) = case cmp x of
        LT -> (x:lts,eqs,gts)
        EQ -> (lts,x:eqs,gts)
        GT -> (lts,eqs,x:gts)

,但现在在使用 ghc -O1 进行编译时遇到了令人困惑的行为, 'foo' 和 'bar' 函数在常量空间中工作,但 doo 函数会导致空间泄漏。

foo xs = xs1
  where
    (xs1,_,_) = ordPartition (flip compare 0) xs

bar xs = xs2
  where
    (_,xs2,_) = ordPartition (flip compare 0) xs

-- pass-thru "least" non-empty partition
doo xs | null xs1  = if null xs2 then xs3 else xs2
       | otherwise = xs1
  where
    (xs1,xs2,xs3) = ordPartition (flip compare 0) xs


main :: IO ()
main = do
    print $ foo [0..100000000::Integer] -- results in []
    print $ bar [0..100000000::Integer] -- results in [0]
    print $ doo [0..100000000::Integer] -- results in [0] with space-leak

所以我现在的问题是,

  1. doo 中空间泄漏的原因是什么,这对我来说似乎令人惊讶,因为 foobar 不会出现这样的空间泄漏吗?和

  2. 有没有一种方法可以实现ordPartition,使得当在doo等函数的上下文中使用时,它以恒定的空间复杂度执行?

I have generalized the existing Data.List.partition implementation

partition :: (a -> Bool) -> [a] -> ([a],[a])
partition p xs = foldr (select p) ([],[]) xs
  where
    -- select :: (a -> Bool) -> a -> ([a], [a]) -> ([a], [a])
    select p x ~(ts,fs) | p x       = (x:ts,fs)
                        | otherwise = (ts, x:fs)

to a "tri-partition" function

ordPartition :: (a -> Ordering) -> [a] -> ([a],[a],[a])
ordPartition cmp xs = foldr select ([],[],[]) xs
  where
    -- select :: a -> ([a], [a], [a]) -> ([a], [a], [a])
    select x ~(lts,eqs,gts) = case cmp x of
        LT -> (x:lts,eqs,gts)
        EQ -> (lts,x:eqs,gts)
        GT -> (lts,eqs,x:gts)

But now I'm facing a confusing behaviour when compiling with ghc -O1, the 'foo' and 'bar' functions work in constant-space, but the doo function leads to a space-leak.

foo xs = xs1
  where
    (xs1,_,_) = ordPartition (flip compare 0) xs

bar xs = xs2
  where
    (_,xs2,_) = ordPartition (flip compare 0) xs

-- pass-thru "least" non-empty partition
doo xs | null xs1  = if null xs2 then xs3 else xs2
       | otherwise = xs1
  where
    (xs1,xs2,xs3) = ordPartition (flip compare 0) xs


main :: IO ()
main = do
    print $ foo [0..100000000::Integer] -- results in []
    print $ bar [0..100000000::Integer] -- results in [0]
    print $ doo [0..100000000::Integer] -- results in [0] with space-leak

So my question now is,

  1. What is the reason for the space-leak in doo, which seems suprising to me, since foo and bar don't exhibit such a space leak? and

  2. Is there a way to implement ordPartition in such a way, that when used in the context of functions such as doo it performs with constant space complexity?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

蓬勃野心 2025-01-08 11:24:34

这不是空间泄漏。要查明组件列表是否为空,必须遍历整个输入列表,如果是,则构造其他组件列表(作为 thunk)。在 doo 情况下,xs1 是空的,因此在决定输出什么之前必须构建整个内容。

这是所有分区算法的基本属性,如果其中一个结果为空,并且您检查其是否为空作为条件,则在遍历整个列表之前无法完成该检查。

It's not a space leak. To find out whether a component list is empty, the entire input list has to be traversed and the other component lists constructed (as thunks) if it is. In the doo case, xs1 is empty, so the entire thing has to be built before deciding what to output.

That is a fundamental property of all partitioning algorithms, if one of the results is empty, and you check for its emptiness as a condition, that check cannot be completed before the entire list has been traversed.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文