如何确保 Data.Vector 的摊销 O(n) 级联？

发布于 2024-12-12 02:28:14 字数 518 浏览 0 评论 0原文

我有一个应用程序，在其中使用向量作为代码的一部分是有效的。然而，在计算过程中我需要跟踪一些元素。我听说你可以从 Data.Vectors 获得 O(n) 摊销串联（通过通常的数组增长技巧），但我认为我做得不对。假设我们有以下设置：

import Data.Vector((++),Vector)
import Prelude hiding ((++))
import Control.Monad.State.Strict

data App = S (Vector Int)

add :: Vector Int -> State App ()
add v1 = modify (\S v2 -> S (v2 ++ v1))

在这种情况下，add 是否会在摊销 O(n) 时间内运行？如果没有，我怎样才能让 add 做到这一点（我需要在状态中存储 (forall s.MVector s Int) 吗？）。有没有更有效的方法来实现add？

原文

I have an application where it is efficient to use Vectors for one part of the code. However, during the computation I need to keep track of some of the elements. I have heard that you can get O(n) amortized concatenation from Data.Vectors (by the usual array growing trick) but I think that I am not doing it right. So lets say we have the following setup:

import Data.Vector((++),Vector)
import Prelude hiding ((++))
import Control.Monad.State.Strict

data App = S (Vector Int)

add :: Vector Int -> State App ()
add v1 = modify (\S v2 -> S (v2 ++ v1))

Does add run in amortized O(n) time in this case? If not, how can I make add do that (do I need to store an (forall s. MVector s Int) in the State?). Is there a more efficient way of implementing add?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

不回头走下去 2024-12-19 02:28:14

我也不太了解矢量库，所以我必须主要遵循一般原则。对它进行基准测试，运行一系列与您在应用程序中期望的类似的添加，并查看您获得的性能。如果它“足够好”，那就太好了，坚持使用简单的代码。如果没有，在状态中存储 (forall s.MVector s Int) 之前（你不能直接存储，元组不能保存 forall-types，所以你需要包装它），尝试通过转换为可变向量来改进添加行为，并在冻结之前执行串联以再次获得不可变向量，粗略地说，

add v1 = do
    S v2 <- get
    let v3 = runST $ do
                m1 <- unsafeThaw v2
                m2 <- unsafeGrow m1 (length v1)
                -- copy contents of v1 behind contents of v2
                unsafeFreeze m2
    put (S v3)

您可能需要在那里插入一些严格性。但是，如果 unsafeGrow 需要复制，则不能保证摊销 O(n) 行为。

则可以通过

存储状态中已用槽的数量
如果新向量适合最后的自由空间，
来获得摊销 O(n) 行为，如果它不适合自由空间，则解冻、复制、冻结而不增长，增长至少 2 倍，保证每个元素平均最多复制两次

I don't know the vector library well either, so I have to stick to general principles mostly. Benchmark it, run a sequence of adds similar to what you expect in your application and see what performance you get. If it's 'good enough', great, stick with the simple code. If not, before storing a (forall s. MVector s Int) in the state (which you can't directly, tuples can't hold forall-types, so you'd need to wrap it), try improving the add-behaviour by converting to mutable vectors and perform the concatenation there before freezing to get an immutable vector again, roughly

add v1 = do
    S v2 <- get
    let v3 = runST $ do
                m1 <- unsafeThaw v2
                m2 <- unsafeGrow m1 (length v1)
                -- copy contents of v1 behind contents of v2
                unsafeFreeze m2
    put (S v3)

You may need to insert some strictness there. However, if unsafeGrow needs to copy, that will not guarantee amortized O(n) behaviour.

You can get amortized O(n) behaviour by

storing the number of used slots in the state too
if the new vector fits in the free space at the end, thaw, copy, freeze without growing
if it doesn't fit in the free space, grow by at least a factor of 2, that guarantees that each element is copied at most twice on average

回复收藏 0 原文

~没有更多了~