How do I convert a (StorableArray (Int, Int) Word8) into a lazy ByteString?

Posted on 2024-09-07 09:10:57

I am trying to load a PNG file, get the uncompressed RGBA bytes, then send them to the gzip or zlib packages.

The pngload package returns image data as a (StorableArray (Int, Int) Word8), and the compression packages take lazy ByteStrings. Therefore, I am attempting to build a (StorableArray (Int, Int) Word8 -> ByteString) function.

So far, I have tried the following:

import qualified Codec.Image.PNG as PNG
import Control.Monad (mapM)
import Data.Array.Storable (withStorableArray)
import qualified Data.ByteString.Lazy as LB (ByteString, pack, take)
import Data.Word (Word8)
import Foreign (Ptr, peekByteOff)

main = do
    -- Load PNG into "image"...
    bytes <- withStorableArray 
        (PNG.imageData image)
        (bytesFromPointer lengthOfImageData)

bytesFromPointer :: Int -> Ptr Word8 -> IO LB.ByteString
bytesFromPointer count pointer = LB.pack $ 
    mapM (peekByteOff pointer) [0..(count-1)]

This causes the stack to run out of memory, so clearly I am doing something very wrong. There are more things I could try with Ptrs and ForeignPtrs, but there are a lot of "unsafe" functions in there.

Any help here would be appreciated; I'm fairly stumped.

Comments (2)

梦幻的心爱 2024-09-14 09:11:38

The problem with your "bytesFromPointer" is that you take a packed representation, the StorableArray from pngload, and you want to convert it to another packed representation, a ByteString, going through an intermediate list. Sometimes laziness means that the intermediate list won't be constructed in memory, but that's not the case here.

The function "mapM" is the first offender. If you expand mapM (peekByteOff pointer) [0..(count-1)] you get

el0 <- peekByteOff pointer 0
el1 <- peekByteOff pointer 1
el2 <- peekByteOff pointer 2
...
eln <- peekByteOff pointer (count-1)
return [el0,el1,el2,...eln]

Because these actions all occur within the IO monad, they are executed in order. This means every element must be read before mapM can return the list, so the whole list is held in memory at once and laziness never has a chance to help you.
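The sequencing behaviour described above can be observed with a small experiment. This is only an illustrative sketch: the name countedHead and the IORef counter are made up for the demonstration and do not come from the question's code.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Run a counting action over [1..n] with mapM, then keep only the first
-- element of the result. Returns that element (as a list) together with
-- the number of actions that had already run when mapM returned.
countedHead :: Int -> IO ([Int], Int)
countedHead n = do
    counter <- newIORef (0 :: Int)
    xs <- mapM (\i -> modifyIORef' counter (+ 1) >> return i) [1 .. n]
    ran <- readIORef counter
    return (take 1 xs, ran)
```

Even though only one element of the result is used, the counter shows that all n actions ran before mapM handed back the list, which is why a large count forces a large list.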

Even if the list were constructed lazily, as Don Stewart points out, the "pack" function would still ruin your performance. The problem with "pack" is that it needs to know how many elements are in the list to allocate the correct amount of memory. To find the length of a list, the program needs to traverse it to the end. Because the length has to be calculated, the list must be entirely loaded before it can be packed into a bytestring.

I consider "mapM", along with "pack", to be a code smell. Sometimes you can replace "mapM" with "mapM_", but in this case it's better to use the bytestring creation functions, e.g. "packCStringLen".
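As a rough sketch of the "packCStringLen" route, assuming the array really holds one byte per element: the function name arrayToLazyByteString and the small example helper are hypothetical, and the length is derived from the array bounds rather than taken from pngload.

```haskell
import Data.Array.Storable
    (StorableArray, getBounds, newListArray, withStorableArray)
import Data.Ix (rangeSize)
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)
import Foreign.Ptr (castPtr)

-- Copy the array's bytes into a fresh strict ByteString with one bulk
-- copy, then wrap it as a single-chunk lazy ByteString. No intermediate
-- list is built.
arrayToLazyByteString :: StorableArray (Int, Int) Word8 -> IO L.ByteString
arrayToLazyByteString arr = do
    bounds <- getBounds arr
    let n = rangeSize bounds  -- number of elements; each Word8 is one byte
    strict <- withStorableArray arr $ \ptr ->
        S.packCStringLen (castPtr ptr, n)
    return (L.fromChunks [strict])

-- Hypothetical usage: convert a 2x3 array holding the bytes 1..6.
example :: IO [Word8]
example = do
    arr <- newListArray ((0, 0), (1, 2)) [1 .. 6]
    L.unpack <$> arrayToLazyByteString arr
```

Because the bytes are copied while the pointer is still valid, the resulting ByteString does not depend on the array afterwards.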

属性 2024-09-14 09:11:33

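Generally, pack and unpack are a bad idea for performance. If you have a Ptr, and a length in bytes, you can generate a strict bytestring in two different ways: with "unsafePackCStringLen", which is O(1) and reuses the memory behind the Ptr, or with "packCStringLen", which copies the bytes.

Like this: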

import qualified Codec.Image.PNG as PNG
import Control.Monad
import Data.Array.Storable (withStorableArray)

import Codec.Compression.GZip

import qualified Data.ByteString.Lazy   as L
import qualified Data.ByteString.Unsafe as S

import Data.Word
import Foreign

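-- Wrap a Ptr Word8 as a strict bytestring in O(1) (no copy), then box it
-- into a single-chunk lazy one. The result shares the array's memory, so it
-- is only valid while the StorableArray is alive and unmodified.
bytesFromPointer :: Int -> Ptr Word8 -> IO L.ByteString
bytesFromPointer n ptr = do
    s <- S.unsafePackCStringLen (castPtr ptr, n)
    return $! L.fromChunks [s]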

-- Dummies, since they were not provided 
image = undefined
lengthOfImageData = 10^3

-- Load a PNG, and compress it, writing it back to disk
main = do
    bytes <- withStorableArray
        (PNG.imageData image)
        (bytesFromPointer lengthOfImageData)
    L.writeFile "foo" . compress $ bytes

I'm using the O(1) version, which just repackages the Ptr from the StorableArray without copying. You might wish to copy the data first, via "packCStringLen".
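One reason to prefer the copy is that the result no longer shares memory with the array. The following is a minimal sketch of that point, using a one-dimensional array for brevity; the names snapshot and copyThenMutate are made up for the illustration.

```haskell
import Data.Array.Storable
    (StorableArray, newListArray, withStorableArray, writeArray)
import qualified Data.ByteString as S
import Data.Word (Word8)
import Foreign.Ptr (castPtr)

-- Take a copying snapshot of the first n bytes of the array.
snapshot :: Int -> StorableArray Int Word8 -> IO S.ByteString
snapshot n arr =
    withStorableArray arr $ \ptr -> S.packCStringLen (castPtr ptr, n)

-- Snapshot a four-byte array, overwrite its first element afterwards,
-- and return the bytes held by the snapshot.
copyThenMutate :: IO [Word8]
copyThenMutate = do
    arr <- newListArray (0, 3) [10, 20, 30, 40]
    bytes <- snapshot 4 arr
    writeArray arr 0 99
    return (S.unpack bytes)
```

With the copying version the snapshot keeps the original bytes after the write. With "unsafePackCStringLen" the ByteString would alias the array, so later writes, or the array being freed, could change or invalidate its contents.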
