为什么将 Data.Binary.Put monad 更改为转换器会导致内存泄漏？

发布于 2024-10-21 06:55:06 字数 1979 浏览 6 评论 0原文

我正在尝试将 Data.Binary.PutM monad 修改为 monad 转换器。所以我开始将它的定义从

newtype PutM a = Put { unPut :: PairS a }

newtype PutM a = Put { unPut :: Identity (PairS a) }

更改然后当然我更改了 return 和 >>= 函数的实现：

From:

return a = Put $ PairS a mempty
{-# INLINE return #-}

m >>= k  = Put $
    let PairS a w  = unPut m
        PairS b w1 = unPut (k a)
    in PairS b (w `mappend` w1)
{-# INLINE (>>=) #-}

m >> k  = Put $
    let PairS _ w  = unPut m
        PairS b w1 = unPut k
    in PairS b (w `mappend` w1)
{-# INLINE (>>) #-}

To:

return a = Put $! return $! PairS a mempty
{-# INLINE return #-}

m >>= k  = Put $!
    do PairS a w  <- unPut m
       PairS b w1 <- unPut (k a)
       return $! PairS b $! (w `mappend` w1)
{-# INLINE (>>=) #-}

m >> k  = Put $!
    do PairS _ w  <- unPut m
       PairS b w1 <- unPut k
       return $! PairS b $! (w `mappend` w1)
{-# INLINE (>>) #-}

就好像 PutM monad 是只是一个 Writer monad。不幸的是，这（再次）创建了空间泄漏。我很清楚（或者是吗？） ghc 正在推迟评估某处，但我尝试按照一些教程的建议将 $! 而不是 $ 放在任何地方，但这确实没有帮助。另外，我不确定内存分析器是否有帮助，如果它向我显示的是：

Memory profile 。

为了完整起见，这是我使用原始 Data.Binary.Put monad 时获得的内存配置文件：

Original memory profile

If有兴趣，这里是我用来测试它的代码和我用来编译的行，运行并创建内存配置文件是：

ghc -auto-all -fforce-recomp -O2 --make test5.hs && ./test5 +RTS -hT && hp2ps -c test5.hp && okular test5.ps

我希望我不会因为我的内存泄漏问题而烦恼任何人。我发现互联网上关于这个主题的好资源并不多，这让新手一无所知。

感谢您的关注。

原文

I'm trying to modify the Data.Binary.PutM monad into a monad transformer. So I started by changin it's definition from

newtype PutM a = Put { unPut :: PairS a }

newtype PutM a = Put { unPut :: Identity (PairS a) }

Then of course I changed the implementations of return and >>= functions:

From:

return a = Put $ PairS a mempty
{-# INLINE return #-}

m >>= k  = Put $
    let PairS a w  = unPut m
        PairS b w1 = unPut (k a)
    in PairS b (w `mappend` w1)
{-# INLINE (>>=) #-}

m >> k  = Put $
    let PairS _ w  = unPut m
        PairS b w1 = unPut k
    in PairS b (w `mappend` w1)
{-# INLINE (>>) #-}

To:

return a = Put $! return $! PairS a mempty
{-# INLINE return #-}

m >>= k  = Put $!
    do PairS a w  <- unPut m
       PairS b w1 <- unPut (k a)
       return $! PairS b $! (w `mappend` w1)
{-# INLINE (>>=) #-}

m >> k  = Put $!
    do PairS _ w  <- unPut m
       PairS b w1 <- unPut k
       return $! PairS b $! (w `mappend` w1)
{-# INLINE (>>) #-}

As if the PutM monad was just a Writer monad. Unfortunately this (again) created a space leak. It is clear to me (or is it?) that ghc is postponing evaluation somewhere but I tried to put $! instead of $ everywhere as suggested by some tutorials but that did not help. Also, I'm not sure how the memory profiler is helpful if what it shows me is this:

Memory profile .

And for completeness, this is the memory profile I get when using the original Data.Binary.Put monad:

Original memory profile

If interested, here is the code I'm using to test it and the line I'm using to compile, run and create the memory profile is:

ghc -auto-all -fforce-recomp -O2 --make test5.hs && ./test5 +RTS -hT && hp2ps -c test5.hp && okular test5.ps

I hope I'm not annoying anyone by my saga of memory leak questions. I find there isn't many good resources on internet about this topic which leaves a newbye clueless.

Thanks for looking.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

我纯我任性 2024-10-28 06:55:06

正如stephen tetley在他的评论中指出的那样，这里的问题在于过于严格。如果您只是在 Identity 示例中添加更多惰性（(>>) 定义中的 ~(PairS b w')），您将得到相同的常量内存运行图片：

data PairS a = PairS a {-# UNPACK #-}!Builder

sndS :: PairS a -> Builder
sndS (PairS _ !b) = b

newtype PutM a = Put { unPut :: Identity (PairS a) }

type Put = PutM ()

instance Monad PutM where
    return a = Put $! return $! PairS a mempty
    {-# INLINE return #-}

    m >>= k  = Put $!
        do PairS a w  <- unPut m
           PairS b w' <- unPut (k a)
           return $! PairS b $! (w `mappend` w')
    {-# INLINE (>>=) #-}

    m >> k  = Put $!
        do PairS _ w  <- unPut m
           ~(PairS b w') <- unPut k
           return $! PairS b $! (w `mappend` w')
    {-# INLINE (>>) #-}

tell' :: Builder -> Put
tell' b = Put $! return $! PairS () b

runPut :: Put -> L.ByteString
runPut = toLazyByteString . sndS . runIdentity . unPut

您实际上可以在这里使用普通元组和 $ 而不是 $!

PS 再次强调：正确的答案实际上在 stephen tetley 中评论。问题是您的第一个示例使用 lazy let 绑定来实现 >> ，因此 Tree 是不强制完全构建，因此“是流式传输的”。您的第二个身份示例很严格，所以我的理解是整个 Tree 在处理之前会在内存中构建。实际上，您可以轻松地对第一个示例添加严格性，并观察它如何开始“占用”内存：

m >> k  = Put $
          case unPut m of
            PairS _ w ->
                case unPut k of
                  PairS b w' ->
                      PairS b (w `mappend` w')

As stephen tetley pointed out in his comment, the problem here is in excessive strictness. If you just add some more laziness to your Identity sample (~(PairS b w') in your (>>) definition) you'll get the same constant memory run picture:

data PairS a = PairS a {-# UNPACK #-}!Builder

sndS :: PairS a -> Builder
sndS (PairS _ !b) = b

newtype PutM a = Put { unPut :: Identity (PairS a) }

type Put = PutM ()

instance Monad PutM where
    return a = Put $! return $! PairS a mempty
    {-# INLINE return #-}

    m >>= k  = Put $!
        do PairS a w  <- unPut m
           PairS b w' <- unPut (k a)
           return $! PairS b $! (w `mappend` w')
    {-# INLINE (>>=) #-}

    m >> k  = Put $!
        do PairS _ w  <- unPut m
           ~(PairS b w') <- unPut k
           return $! PairS b $! (w `mappend` w')
    {-# INLINE (>>) #-}

tell' :: Builder -> Put
tell' b = Put $! return $! PairS () b

runPut :: Put -> L.ByteString
runPut = toLazyByteString . sndS . runIdentity . unPut

You actually can use normal tuples here and $ instead of $!

PS Once again: the right answer is actually in stephen tetley comment. The thing is that your 1st example uses lazy let bindings for >> implementation, so the Tree is not forced to be built entirely and hence "is streamed". Your 2nd Identity example is strict, so my understanding is that the whole Tree gets built in memory before being processed. You can actually easily add strictness to the 1st example and observe how it starts 'hogging' memory:

m >> k  = Put $
          case unPut m of
            PairS _ w ->
                case unPut k of
                  PairS b w' ->
                      PairS b (w `mappend` w')

回复收藏 0 原文

~没有更多了~