Haskell 中动态规划的高效表

发布于 2024-10-20 17:33:15 字数 2393 浏览 2 评论 0原文

我已经用Haskell编写了0-1背包问题。我对迄今为止所取得的懒惰和普遍性水平感到相当自豪。

我首先提供用于创建和处理惰性二维矩阵的函数。

mkList f = map f [0..]
mkTable f = mkList (\i -> mkList (\j -> f i j))

tableIndex table i j = table !! i !! j

然后,我为给定的背包问题制作一个特定的表格

knapsackTable = mkTable f
    where f 0 _ = 0
          f _ 0 = 0
          f i j | ws!!i > j = leaveI
                | otherwise = max takeI leaveI
              where takeI  = tableIndex knapsackTable (i-1) (j-(ws!!i)) + vs!!i
                    leaveI = tableIndex knapsackTable (i-1) j

-- weight value pairs; item i has weight ws!!i and value vs!!i
ws  = [0,1,2, 5, 6, 7] -- weights
vs  = [0,1,7,11,21,31] -- values

,并使用几个辅助函数来查看该表格,

viewTable table maxI maxJ = take (maxI+1) . map (take (maxJ+1)) $ table
printTable table maxI maxJ = mapM_ print $ viewTable table maxI maxJ

这非常简单。但我想更进一步。

我想要一个更好的表数据结构。理想情况下,它应该是

  • 未装箱(不可变) [编辑]别介意这个
  • 惰性
  • 无界
  • O(1) 构造
  • 查找给定条目的O(1)时间复杂度,
    (更现实地说,最坏的情况是 O(log n),其中 n 是 i*j,用于查找第 i 行、j 列的条目)

如果您能解释一下,即可加分您的解决方案为何/如何满足这些理想。

如果您可以进一步概括 knapsackTable,并证明它是有效的,也会加分。

在改进数据结构时,您应该尝试满足以下目标:

  • 如果我要求最大权重为 10 的解决方案(在我当前的代码中,这将是 indexTable knapsackTable 5 10,5 意味着包括第 1-5 项),仅应执行最少量的必要工作。理想情况下,这意味着不需要 O(i*j) 来强制表格每行的主干达到必要的列长度。如果您认为 DP 意味着评估整个表,那么您可以说这不是“真正的”DP。
  • 如果我要求打印整个表格(例如 printTable knapsackTable 5 10),则每个条目的值应该计算一次且仅一次。给定单元格的值应取决于其他单元格的值(DP 风格:想法是,永远不要重新计算同一子问题两次)

想法:

对我所说的做出一些妥协的答案只要它们提供了丰富的信息,就会被投票赞成。妥协最少的答案可能是“被接受”的答案。

I've coded up the 0-1 Knapsack problem in Haskell. I'm fairly proud about the laziness and level of generality achieved so far.

I start by providing functions for creating and dealing with a lazy 2d matrix.

mkList f = map f [0..]
mkTable f = mkList (\i -> mkList (\j -> f i j))

tableIndex table i j = table !! i !! j

I then make a specific table for a given knapsack problem

knapsackTable = mkTable f
    where f 0 _ = 0
          f _ 0 = 0
          f i j | ws!!i > j = leaveI
                | otherwise = max takeI leaveI
              where takeI  = tableIndex knapsackTable (i-1) (j-(ws!!i)) + vs!!i
                    leaveI = tableIndex knapsackTable (i-1) j

-- weight value pairs; item i has weight ws!!i and value vs!!i
ws  = [0,1,2, 5, 6, 7] -- weights
vs  = [0,1,7,11,21,31] -- values

And finish off with a couple helper functions for looking at the table

viewTable table maxI maxJ = take (maxI+1) . map (take (maxJ+1)) $ table
printTable table maxI maxJ = mapM_ print $ viewTable table maxI maxJ

This much was pretty easy. But I want to take it a step further.

I want a better data structure for the table. Ideally, it should be

  • Unboxed (immutable) [edit] never mind this
  • Lazy
  • Unbounded
  • O(1) time to construct
  • O(1) time complexity for looking up a given entry,
    (more realistically, at worst O(log n), where n is i*j for looking up the entry at row i, column j)

Bonus points if you can explain why/how your solution satisfies these ideals.

Also bonus points if you can further generalize knapsackTable, and prove that it is efficient.

In improving the data structure you should try to satisfy the following goals:

  • If I ask for the solution where the maximum weight is 10 (in my current code, that would be indexTable knapsackTable 5 10, the 5 means include items 1-5) only the minimal amount of work necessary should be performed. Ideally this means no O(i*j) work for forcing the spine of each row of the table to necessary column length. You could say this isn't "true" DP, if you believe DP means evaluating the entirety of the table.
  • If I ask for the entire table to be printed (something like printTable knapsackTable 5 10), the values of each entry should be computed once and only once. The values of a given cell should depend on the values of other cells (DP style: the idea being, never recompute the same subproblem twice)

Ideas:

Answers that make some compromises to my stated ideals will be upvoted (by me, anyways) as long as they are informative. The answer with the least compromises will probably be the "accepted" one.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

祁梦 2024-10-27 17:33:15

首先,您对未装箱数据结构的标准可能有点误导。未装箱的值必须是严格的,并且它们与不变性无关。我要提出的解决方案是不可变的、惰性的和盒装的。另外,我不确定您希望以何种方式构建和查询为 O(1)。我提议的结构是惰性构建的,但由于它可能是无界的,因此其完整构建将需要无限的时间。对于任何大小为 k 的特定键,查询结构将花费 O(k) 时间,但当然您正在查找的值可能需要更多时间来计算。

数据结构是惰性字典树。我在代码中使用 Conal Elliott 的 MemoTrie 库。为了通用性,它使用函数而不是列表来表示权重和值。

knapsack :: (Enum a, Num w, Num v, Num a, Ord w, Ord v, HasTrie a, HasTrie w) =>
            (a -> w) -> (a -> v) -> a -> w -> v
knapsack weight value = knapsackMem
  where knapsackMem = memo2 knapsack'
        knapsack' 0 w = 0
        knapsack' i 0 = 0
        knapsack' i w
          | weight i > w = knapsackMem (pred i) w
          | otherwise = max (knapsackMem (pred i) w)
                        (knapsackMem (pred i) (w - weight i)) + value i

基本上,它是作为一个具有惰性脊柱和惰性值的特里树来实现的。它仅受密钥类型的限制。因为整个事情是惰性的,所以在使用查询强制执行之前的构造是 O(1)。每个查询都会强制沿着 trie 及其值使用一条路径,因此对于有界键大小,其时间复杂度为 O(1) O(log n)。正如我已经说过的,它是不可变的,但不是拆箱的。

它将分担递归调用中的所有工作。它实际上不允许您直接打印 trie,但是这样的事情不应该做任何多余的工作:

mapM_ (print . uncurry (knapsack ws vs)) $ range ((0,0), (i,w))

First, your criterion for an unboxed data structure is probably a bit mislead. Unboxed values must be strict, and they have nothing to do with immutability. The solution I'm going to propose is immutable, lazy, and boxed. Also, I'm not sure in what way you are wanting construction and querying to be O(1). The structure I'm proposing is lazily constructed, but because it's potentially unbounded, its full construction would take infinite time. Querying the structure will take O(k) time for any particular key of size k, but of course the value you're looking up may take further time to compute.

The data structure is a lazy trie. I'm using Conal Elliott's MemoTrie library in my code. For genericity, it takes functions instead of lists for the weights and values.

knapsack :: (Enum a, Num w, Num v, Num a, Ord w, Ord v, HasTrie a, HasTrie w) =>
            (a -> w) -> (a -> v) -> a -> w -> v
knapsack weight value = knapsackMem
  where knapsackMem = memo2 knapsack'
        knapsack' 0 w = 0
        knapsack' i 0 = 0
        knapsack' i w
          | weight i > w = knapsackMem (pred i) w
          | otherwise = max (knapsackMem (pred i) w)
                        (knapsackMem (pred i) (w - weight i)) + value i

Basically, it's implemented as a trie with a lazy spine and lazy values. It's bounded only by the key type. Because the entire thing is lazy, its construction before forcing it with queries is O(1). Each query forces a single path down the trie and its value, so it's O(1) for a bounded key size O(log n). As I already said, it's immutable, but not unboxed.

It will share all work in the recursive calls. It doesn't actually allow you to print the trie directly, but something like this should not do any redundant work:

mapM_ (print . uncurry (knapsack ws vs)) $ range ((0,0), (i,w))
归途 2024-10-27 17:33:15

Unboxed 意味着严格和有界。任何 100% 未装箱的东西都不能是惰性的或无限制的。通常的妥协体现在将 [Word8] 转换为 Data.ByteString.Lazy,其中存在未装箱的块(严格的 ByteString),它们以无限制的方式延迟链接在一起。

可以使用“scanl”、“zipWith”和我的“takeOnto”制作更高效的表生成器(增强以跟踪单个项目)。这有效地避免了在创建表时使用 (!!):

import Data.List(sort,genericTake)

type Table = [ [ Entry ] ]

data Entry = Entry { bestValue :: !Integer, pieces :: [[WV]] }
  deriving (Read,Show)

data WV = WV { weight, value :: !Integer }
  deriving (Read,Show,Eq,Ord)

instance Eq Entry where
  (==) a b = (==) (bestValue a) (bestValue b)

instance Ord Entry where
  compare a b = compare (bestValue a) (bestValue b)

solutions :: Entry -> Int
solutions = length . filter (not . null) . pieces

addItem :: Entry -> WV -> Entry
addItem e wv = Entry { bestValue = bestValue e + value wv, pieces = map (wv:) (pieces e) }

-- Utility function for improve
takeOnto :: ([a] -> [a]) -> Integer -> [a] -> [a]
takeOnto endF = go where
  go n rest | n <=0 = endF rest
            | otherwise = case rest of
                            (x:xs) -> x : go (pred n) xs
                            [] -> error "takeOnto: unexpected []"

improve oldList wv@(WV {weight=wi,value = vi}) = newList where
  newList | vi <=0 = oldList
          | otherwise = takeOnto (zipWith maxAB oldList) wi oldList
  -- Dual traversal of index (w-wi) and index w makes this a zipWith
  maxAB e2 e1 = let e2v = addItem e2 wv
                in case compare e1 e2v of
                     LT -> e2v
                     EQ -> Entry { bestValue = bestValue e1
                                 , pieces = pieces e1 ++ pieces e2v }
                     GT -> e1

-- Note that the returned table is finite
-- The dependence on only the previous row makes this a "scanl" operation
makeTable :: [Int] -> [Int] -> Table
makeTable ws vs =
  let wvs = zipWith WV (map toInteger ws) (map toInteger vs)
      nil = repeat (Entry { bestValue = 0, pieces = [[]] })
      totW = sum (map weight wvs)
  in map (genericTake (succ totW)) $ scanl improve nil wvs

-- Create specific table, note that weights (1+7) equal weight 8
ws, vs :: [Int]
ws  = [2,3, 5, 5, 6, 7] -- weights
vs  = [1,7,8,11,21,31] -- values

t = makeTable ws vs

-- Investigate table

seeTable = mapM_ seeBestValue t
  where seeBestValue row = mapM_ (\v -> putStr (' ':(show (bestValue v)))) row >> putChar '\n'

ways = mapM_ seeWays t
  where seeWays row = mapM_ (\v -> putStr (' ':(show (solutions v)))) row >> putChar '\n'

-- This has two ways of satisfying a bestValue of 8 for 3 items up to total weight 5
interesting = print (t !! 3 !! 5) 

Unboxed implies strict and bounded. Anything 100% Unboxed cannot be Lazy or Unbounded. The usual compromise is embodied in converting [Word8] to Data.ByteString.Lazy where there are unboxed chunks (strict ByteString) which are linked lazily together in an unbounded way.

A much more efficient table generator (enhanced to track individual items) could be made using "scanl", "zipWith", and my "takeOnto". This effectively avoid using (!!) while creating the table:

import Data.List(sort,genericTake)

type Table = [ [ Entry ] ]

data Entry = Entry { bestValue :: !Integer, pieces :: [[WV]] }
  deriving (Read,Show)

data WV = WV { weight, value :: !Integer }
  deriving (Read,Show,Eq,Ord)

instance Eq Entry where
  (==) a b = (==) (bestValue a) (bestValue b)

instance Ord Entry where
  compare a b = compare (bestValue a) (bestValue b)

solutions :: Entry -> Int
solutions = length . filter (not . null) . pieces

addItem :: Entry -> WV -> Entry
addItem e wv = Entry { bestValue = bestValue e + value wv, pieces = map (wv:) (pieces e) }

-- Utility function for improve
takeOnto :: ([a] -> [a]) -> Integer -> [a] -> [a]
takeOnto endF = go where
  go n rest | n <=0 = endF rest
            | otherwise = case rest of
                            (x:xs) -> x : go (pred n) xs
                            [] -> error "takeOnto: unexpected []"

improve oldList wv@(WV {weight=wi,value = vi}) = newList where
  newList | vi <=0 = oldList
          | otherwise = takeOnto (zipWith maxAB oldList) wi oldList
  -- Dual traversal of index (w-wi) and index w makes this a zipWith
  maxAB e2 e1 = let e2v = addItem e2 wv
                in case compare e1 e2v of
                     LT -> e2v
                     EQ -> Entry { bestValue = bestValue e1
                                 , pieces = pieces e1 ++ pieces e2v }
                     GT -> e1

-- Note that the returned table is finite
-- The dependence on only the previous row makes this a "scanl" operation
makeTable :: [Int] -> [Int] -> Table
makeTable ws vs =
  let wvs = zipWith WV (map toInteger ws) (map toInteger vs)
      nil = repeat (Entry { bestValue = 0, pieces = [[]] })
      totW = sum (map weight wvs)
  in map (genericTake (succ totW)) $ scanl improve nil wvs

-- Create specific table, note that weights (1+7) equal weight 8
ws, vs :: [Int]
ws  = [2,3, 5, 5, 6, 7] -- weights
vs  = [1,7,8,11,21,31] -- values

t = makeTable ws vs

-- Investigate table

seeTable = mapM_ seeBestValue t
  where seeBestValue row = mapM_ (\v -> putStr (' ':(show (bestValue v)))) row >> putChar '\n'

ways = mapM_ seeWays t
  where seeWays row = mapM_ (\v -> putStr (' ':(show (solutions v)))) row >> putChar '\n'

-- This has two ways of satisfying a bestValue of 8 for 3 items up to total weight 5
interesting = print (t !! 3 !! 5) 
长途伴 2024-10-27 17:33:15

惰性可存储向量:http://hackage.haskell.org/package/storablevector

无界,惰性, O(chunksize) 构建时间,O(n/chunksize) 索引,其中 chunksize 对于任何给定目的都可以足够大。基本上是一个具有一些显着的常数因子优点的惰性列表。

Lazy storable vectors: http://hackage.haskell.org/package/storablevector

Unbounded, lazy, O(chunksize) time to construct, O(n/chunksize) indexing, where chunksize can be sufficiently large for any given purpose. Basically a lazy list with some significant constant factor benifits.

霊感 2024-10-27 17:33:15

为了记忆函数,我推荐像 Luke Palmer 的 memo Combinators 这样的库。该库使用 try,它是无界的并且具有 O(key size) 查找。 (一般来说,你不能比 O(key size) 查找更好,因为你总是必须触及密钥的每一位。)

knapsack :: (Int,Int) -> Solution
knapsack = memo f
    where
    memo    = pair integral integral
    f (i,j) = ... knapsack (i-b,j) ...

在内部,integral 组合器可能会构建一个无限数据结构

data IntTrie a = Branch IntTrie a IntTrie

integral f = \n -> lookup n table
     where
     table = Branch (\n -> f (2*n)) (f 0) (\n -> f (2*n+1))

Lookup 的工作方式如下:

lookup 0 (Branch l a r) = a
lookup n (Branch l a r) = if even n then lookup n2 l else lookup n2 r
     where n2 = n `div` 2

还有其他方法可以构建无限尝试,但这种方法很流行。

To memoize functions, I recommend a library like Luke Palmer's memo combinators. The library uses tries, which are unbounded and have O(key size) lookup. (In general, you can't do better than O(key size) lookup because you always have to touch every bit of the key.)

knapsack :: (Int,Int) -> Solution
knapsack = memo f
    where
    memo    = pair integral integral
    f (i,j) = ... knapsack (i-b,j) ...

Internally, the integral combinator probably builds an infinite data structure

data IntTrie a = Branch IntTrie a IntTrie

integral f = \n -> lookup n table
     where
     table = Branch (\n -> f (2*n)) (f 0) (\n -> f (2*n+1))

Lookup works like this:

lookup 0 (Branch l a r) = a
lookup n (Branch l a r) = if even n then lookup n2 l else lookup n2 r
     where n2 = n `div` 2

There are other ways to build infinite tries, but this one is popular.

温柔嚣张 2024-10-27 17:33:15

为什么不使用 Data.Map 将另一个 Data.Map 放入其中?据我所知,速度相当快。
但也不会偷懒。

不仅如此,您还可以为数据实现 Ord 类型类

data Index = Index Int Int

,并直接将二维索引作为键。您可以通过将此地图生成为列表来实现惰性,然后只需使用

fromList [(Index 0 0, value11), (Index 0 1, value12), ...] 

Why won't you use Data.Map putting the other Data.Map into it? As far as I know it's quite fast.
It wouldn't be lazy though.

More than that, you can implement Ord typeclass for you data

data Index = Index Int Int

and put a two dimensional index directly as a key. You can achieve laziness by generating this map as a list and then just use

fromList [(Index 0 0, value11), (Index 0 1, value12), ...] 
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文