Can I always convert mutable-only algorithms into single-assignment ones and still be efficient?
The Context
The context of this question is that I want to play around with Gene Expression Programming (GEP), a form of evolutionary algorithm, using Erlang. GEP makes use of a string-based DSL called 'Karva notation'. Karva notation is easily translated into expression parse trees, but the translation algorithm assumes an implementation with mutable objects: incomplete sub-expressions are created early on in the translation process, and their own sub-expressions are filled in later with values that were not known at the time they were created.
The purpose of Karva notation is that it guarantees syntactically correct expressions are created without any expensive encoding techniques or corrections of genetic code. The problem is that with a single-assignment programming language like Erlang, I have to recreate the expression tree continually as each sub-expression gets filled in. This takes an inexpensive - O(n)? - update operation and converts it into one that would complete in exponential time (unless I'm mistaken). If I can't find an efficient functional algorithm to convert K-expressions into expression trees, then one of the compelling features of GEP is lost.
The Question
I appreciate that the K-expression translation problem is pretty obscure, so what I want is advice on how to convert an inherently non-functional algorithm (one that exploits mutable data structures) into one that does not. How do pure functional programming languages adapt the many algorithms and data structures produced in the early days of computer science that depend on mutability to get the performance characteristics they need?
Carefully designed immutability avoids unnecessary updating
Immutable data structures are only an efficiency problem if they're constantly changing, or you build them up the wrong way. For example, continually appending more to the end of a growing list is quadratic, whereas concatenating a list of lists is linear. If you think carefully, you can usually build up your structure in a sensible way, and lazy evaluation is your friend - hand out a promise to work it out and stop worrying.
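As a small illustration of that difference (hypothetical helpers, not code from this answer):

```haskell
-- Quadratic: each (++) re-walks the whole accumulated prefix.
buildSlow :: [[a]] -> [a]
buildSlow = foldl (++) []

-- Linear: build the pieces independently and concatenate once.
buildFast :: [[a]] -> [a]
buildFast = concat
```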
Blindly trying to replicate an imperative algorithm can be inefficient, but you're mistaken in your assertion that functional programming has to be asymptotically bad here.
Case study: pure functional GEP: Karva notation in linear time
I'll stick with your case study of parsing Karva notation for GEP. (I've played with this solution more fully in this answer.)
Here's a fairly clean pure functional solution to the problem. I'll take the opportunity to name drop some good general recursion schemes along the way.
Code
(Importing Data.Tree supplies data Tree a = Node {rootLabel :: a, subForest :: Forest a}, where type Forest a = [Tree a].)
A hylomorphism is the composition of an anamorphism (build up, unfoldr) and a catamorphism (combine, foldr).
These terms were introduced to the FP community in the seminal paper Functional Programming with Bananas, Lenses and Barbed Wire.
We're going to pull the levels out (ana/unfold) and combine them back together (cata/fold).
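A minimal sketch of such a driver (the exact definition is an assumption, chosen to fit the hylomorphism [] combineLevel pullLevel zero (1,cs) call analysed below):

```haskell
-- Unfold the seed one step at a time (pullout) until stop holds,
-- combining the pieces back together as the recursion returns.
hylomorphism :: b -> (a -> b -> b) -> (c -> (a, c)) -> (c -> Bool) -> c -> b
hylomorphism base combine pullout stop = hylo
  where
    hylo seed
      | stop seed = base
      | otherwise = let (piece, seed') = pullout seed
                    in combine piece (hylo seed')
```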
To pull out a level, we use the total arity from the previous level to find where to split off this new level, and pass on the total arity for this one ready for next time:
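For instance (a sketch; the arity table is an illustrative assumption about the symbol alphabet):

```haskell
-- Arity of each symbol: an illustrative assumption, not fixed by GEP.
arity :: Char -> Int
arity c
  | c `elem` "+-*/" = 2
  | c == 'Q'        = 1   -- e.g. square root
  | otherwise       = 0   -- terminals such as a, b, c, d

-- Split off the next level: its width is the previous level's total
-- arity; pass this level's total arity along for the next step.
pullLevel :: (Int, String) -> (String, (Int, String))
pullLevel (part, cs) = (level, (sum (map arity level), rest))
  where (level, rest) = splitAt part cs
```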
To combine a level (as a String) with the level below (that's already a Forest), we just pull off the number of trees that each character needs.
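A sketch of that step (building on arity above; Node and Forest come from Data.Tree):

```haskell
import Data.Tree (Tree(..), Forest)

-- Hand each character of this level the subtrees it needs from the
-- forest already built for the level below.
combineLevel :: String -> Forest Char -> Forest Char
combineLevel []     _     = []
combineLevel (c:cs) below = Node c kids : combineLevel cs rest
  where (kids, rest) = splitAt (arity c) below
```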
Now we can parse the Karva using a hylomorphism. Note that we seed it with a total arity of 1 from outside the string, since there's only one node at the root level. Correspondingly, we apply head to the result to get this singleton back out after the hylomorphism.
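Putting the pieces together (again a sketch, with zero as the stopping test on the arity count):

```haskell
-- Stop once a level would contain zero nodes.
zero :: (Int, String) -> Bool
zero (part, _) = part == 0

-- Seed with arity 1 (one root node), then recover that single tree.
karvaToTree :: String -> Tree Char
karvaToTree cs = head (hylomorphism [] combineLevel pullLevel zero (1, cs))
```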
Linear Time
There's no exponential blowup, nor repeated O(log(n)) lookups or expensive modifications, so we shouldn't be in too much trouble.
arity is O(1).
splitAt part is O(part).
pullLevel (part,cs) is O(part) for the grab using splitAt to get level, plus O(part) for the map arity level, so O(part) overall.
combineLevel (c:cs) is O(arity c) for the splitAt, and O(sum $ map arity cs) for the recursive call.
hylomorphism [] combineLevel pullLevel zero (1,cs) makes one pullLevel call for each level, so the total pullLevel cost is O(sum parts) = O(n); one combineLevel call for each level, so the total combineLevel cost is O(sum $ map arity levels) = O(n), since the total arity of the entire input is bound by n for valid strings; and O(#levels) calls to zero (which is O(1)), with #levels bound by n, so that's below O(n) too.
Hence karvaToTree is linear in the length of the input.
I think that puts to rest the assertion that you needed to use mutability to get a linear algorithm here.
Demo
Let's have a draw of the results (because Tree is so full of syntax it's hard to read the output!). You have to cabal install pretty-tree to get Data.Tree.Pretty; the drawn output matches what's expected from the tutorial where I found the example.
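A sketch of such a demo (drawVerticalTree comes from the pretty-tree package; the example string is illustrative):

```haskell
import Data.Tree (Tree)
import Data.Tree.Pretty (drawVerticalTree)

-- Render a Tree Char vertically; fmap (:[]) turns each Char label
-- into the String label that drawVerticalTree expects.
see :: Tree Char -> IO ()
see = putStrLn . drawVerticalTree . fmap (: [])

-- e.g.  see (karvaToTree "Q*+-abcd")   -- sqrt ((a + b) * (c - d))
```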
There isn't a single way to do this; it really has to be attempted case by case. I typically try to break the algorithm down into simpler operations using fold and unfold and then optimize from there. The Karva decoding case is a breadth-first tree unfold, as others have noted, so I started with unfoldTreeM_BF. Perhaps there are similar functions in Erlang.
If the decoding operation is unreasonably expensive, you could memoize the decoding and share/reuse subtrees... though it probably wouldn't fit into a generic tree unfolder and you'd need to write a specialized function to do so. If the fitness function is slow enough, it may be fine to use a naive decoder like the one listed below. It will fully rebuild the tree on each invocation.
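A minimal sketch of such a naive breadth-first decoder (an illustration rather than the original listing), using Data.Tree's unfoldTreeM_BF with a State monad over the unread genome; the arity table is an assumption:

```haskell
import Control.Monad.State (State, evalState, gets, put)
import Data.Tree (Tree, unfoldTreeM_BF)

-- Arity of each symbol: an illustrative assumption.
arity :: Char -> Int
arity c
  | c `elem` "+-*/" = 2
  | c == 'Q'        = 1
  | otherwise       = 0

-- Expand one node: its children are the next `arity c` symbols of the
-- genome. unfoldTreeM_BF sequences these actions breadth-first, which
-- is exactly the order in which Karva notation lists the symbols.
step :: Char -> State String (Char, [Char])
step c = do
  (kids, rest) <- gets (splitAt (arity c))
  put rest
  return (c, kids)

decode :: String -> Tree Char
decode (c:cs) = evalState (unfoldTreeM_BF step c) cs
decode []     = error "empty K-expression"
```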
There are a couple of solutions when mutable state is required in functional programming.
Use a different algorithm that solves the same problem. E.g. quicksort is generally regarded as mutable and may therefore be less useful in a functional setting, but mergesort is generally better suited for a functional setting. I can't tell if this option is possible or makes sense in your case.
Even functional programming languages usually provide some way to mutate state. (This blog post seems to show how to do it in Erlang.) For some algorithms and data structures this is indeed the only available option (there's active research on the topic, I think); for example hash tables in functional programming languages are generally implemented with mutable state.
In your case, I'm not so sure immutability really leads to a performance bottleneck. You are right, the (sub)tree will be recreated on update, but the Erlang implementation will probably reuse all the subtrees that haven't changed, leading to O(log n) complexity per update instead of O(1) with mutable state. Also, the nodes of the trees won't be copied but instead the references to the nodes, which should be relatively efficient. You can read about tree updates in a functional setting in e.g. the thesis from Okasaki or in his book "Purely Functional Data Structures" based on the thesis. I'd try implementing the algorithm with an immutable data structure and switch to a mutable one if you have a performance problem.
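As a rough illustration of that sharing (a hypothetical tree type, not code from the question): updating one node only rebuilds the nodes on the path down to it, and every untouched subtree is reused.

```haskell
-- Hypothetical expression-tree type, for illustration only.
data Expr = Node Char [Expr] deriving Show

-- Replace the subtree reached by following a list of child indices.
-- Only the nodes along the path (and their child-list spines) are
-- rebuilt; every sibling subtree is shared with the original tree.
replaceAt :: [Int] -> Expr -> Expr -> Expr
replaceAt []     new _             = new
replaceAt (i:is) new (Node c kids) =
  Node c (before ++ replaceAt is new child : after)
  where (before, child:after) = splitAt i kids
```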
Also see some relevant SO questions here and here.
I think I figured out how to solve your particular problem with the K trees (the general problem is too hard :P). My solution is presented in some horrible sort of hybrid Python-like pseudocode (I am very slow with my FP today), but it doesn't change a node after you create one (the trick is building the tree bottom-up).
First, we need to find which nodes belong to which level:
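A hedged Haskell sketch of that step (the arity table is an illustrative assumption about the symbol alphabet):

```haskell
-- Arity of each symbol: an illustrative assumption.
arity :: Char -> Int
arity c
  | c `elem` "+-*/" = 2
  | otherwise       = 0

-- Slice the genome into levels: each level is as wide as the total
-- arity of the level above it; the root level has width 1.
levels :: String -> [String]
levels = go 1
  where
    go 0 _  = []      -- no more nodes needed; ignore any unused tail
    go _ [] = []
    go n cs = let (level, rest) = splitAt n cs
              in level : go (sum (map arity level)) rest
```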
So in the +/*abcd example, this should give you [+, /*, abcd].
Now you can convert this into a tree bottom-up, as in the sketch below. I am pretty sure we can convert this into single-assignment Erlang/Haskell very easily now.
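A hedged Haskell sketch of that bottom-up construction (building on levels and arity above; Node and Forest come from Data.Tree):

```haskell
import Data.Tree (Tree(..), Forest)

-- Give each symbol of a level the subtrees it needs from the forest
-- already built for the level below it.
combine :: String -> Forest Char -> Forest Char
combine []     _     = []
combine (c:cs) below = Node c kids : combine cs rest
  where (kids, rest) = splitAt (arity c) below

-- Fold the levels from the deepest one up towards the root; no node
-- is ever modified after it has been created.
karvaToTree :: String -> Tree Char
karvaToTree = head . foldr combine [] . levels
```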