用递归方案修剪Alpha Beta

发布于 2025-02-04 18:02:12 字数 488 浏览 2 评论 0 原文

我正在尝试更精通递归方案,因为到目前为止,它们确实有助于将粗糙的显式递归代码转变为较少的尖峰。在实施可能真正令人困惑的显式递归混淆的算法时,我倾向于提供的其他工具之一是Monad Transformers / Mutability。理想情况下,我想对递归方案感到足够的舒适感,以便我可以完全抛弃状态。我仍然可以触及的算法的一个示例是带有alpha beta修剪的minimax。我使用can态和minimax f-algebra( data minimaxf af = mmresult a | mmstate [f] bool )进行了正常的minimax,但是我不确定如何扩展此操作以进行alpha beta beta修剪。我认为也许我可以使用组织态性,或者也许有一些与ComOnads的自定义解决方案,但是我不知道如何使用任何一种技术来尝试解决方案。

除了具有递归方案的Alpha Beta修剪版本外,您对解决类似问题的任何一般建议都将不胜感激。例如,我很难将递归方案应用于通常以命令式实施的Dijkstra等算法。

I'm trying to get more proficient with recursion schemes as they have so far been really helpful for turning gnarly explicit recursion code into something less spike-y. One of the other tools I tend to reach for when implementing algorithms that can get really confusing with explicit recursion is monad transformers / mutability. Ideally I'd like to get comfortable enough with recursion schemes such that I can ditch statefulness altogether. An example of an algorithm I'd still reach for the transformers for is minimax with alpha beta pruning. I did normal minimax with a catamorphism and minimax f-algebra (data MinimaxF a f = MMResult a | MMState [f] Bool), but I wasn't sure how I could extend this to do alpha beta pruning. I thought maybe I could use histomorphism, or maybe there was some custom solution with comonads, but I didn't know how to approach trying a solution using either technique.

In addition to a version of alpha beta pruning with recursion schemes any general advice you have about tackling similar problems would be much appreciated. For example I've had trouble applying recursion schemes to algorithms like Dijkstra that usually are implemented in an imperative fashion.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

烟雨凡馨 2025-02-11 18:02:12

alpha-beta可以看作是minimax的实例,其中 min max 是使用精心选择的晶格实例化的。 完整的GIST

我们将游戏表示为树,每个内部节点都是游戏中的位置,等待指定玩家选择移动到子节点,而每片叶子都是带有其得分或价值的最终位置。

-- | At every step, either the game ended with a value/score,
-- or one of the players is to play.
data GameF a r = Value a | Play Player (NonEmpty r)
 deriving Functor
type Game a = Fix (GameF a)

-- | One player wants to maximize the score,
-- the other wants to minimize the score.
data Player = Mini | Maxi

minimax 将在以下类定义的任何晶格上工作:

class Lattice l where
  inf, sup :: l -> l -> l

lattice 类比 ord> ord> ord> ord :和代码>实例是 lattice ,具有可决定性等效的( eq )。如果我们可以重新定义 ORD ,那么将 lattice 作为超类是适当的。但是在这里,newtype必须做:

-- The Lattice induced by an Ord
newtype Order a = Order { unOrder :: a }
  deriving (Eq, Ord)

instance Ord a => Lattice (Order a) where
  inf = min
  sup = max

这是minimax。它是通过嵌入 leaf :: a - &gt进行参数化的。 l 最终值为所选晶格。一个玩家最大化嵌入式值,另一个玩家将其最小化。

-- | Generalized minimax
gminimax :: Lattice l => (a -> l) -> Game a -> l
gminimax leaf = cata minimaxF where
  minimaxF (Value x) = leaf x
  minimaxF (Play p xs) = foldr1 (lopti p) xs

lopti :: Lattice l => Player -> l -> l -> l
lopti Mini = inf
lopti Maxi = sup

“常规” minimax将游戏的分数直接用作晶格:

minimax :: Ord a => Game a -> a
minimax = unOrder . gminimax Order

对于Alpha-Beta修剪,其想法是我们可以跟踪最佳分数的某些界限,这使我们可以在搜索中缩短搜索。因此,搜索将通过该间隔(Alpha,beta)进行参数化。这使我们进入了函数的晶格间隔a - >

newtype Pruning a = Pruning { unPruning :: Interval a -> a }

一个间隔可以由表示(也许是A,也许是A),以允许双方无限。但是,我们将使用更好的命名类型来清晰,并且在每一侧都利用不同的 ord 实例:

type Interval a = (WithBot a, WithTop a)
data WithBot a = Bot | NoBot a deriving (Eq, Ord)
data WithTop a = NoTop a | Top deriving (Eq, Ord)

我们将要求我们只能构造修剪f 如果 f 满足夹具i(fi)=夹具i(f(bot,top)),其中夹具在下面定义。这样一来, f 是一种搜索算法,如果它得知其结果在间隔之外,则可以将其折叠起来,而无需找到确切的结果。

clamp :: Ord a => Interval a -> a -> a
clamp (l, r) = clampBot l . clampTop r

clampBot :: Ord a => WithBot a -> a -> a
clampBot Bot x = x
clampBot (NoBot y) x = max y x

clampTop :: Ord a => WithTop a -> a -> a
clampTop Top x = x
clampTop (NoTop y) x = min y x

函数通过尖端提起形成晶格。而且,当我们仅考虑满足夹具i(fi)=夹具i(f(bot,top))的功能时>如果 clamp<*> f = clamp<*> g ),则可能是晶格的短路定义。

两个功能的 inf l r ,给定间隔 i =(alpha,beta),首先运行 l(Alpha,beta)获得值 vl
如果 vl< = alpha ,则必须是夹具i vl == alpha ==钳位i(min vl(ri)),因此我们可以停止并返回 vl 而无需查看 r 。否则,我们知道最终结果将不超过 vl ,因此我们还可以更新传递给 r r 。 >。 sup 对称定义。

instance Ord a => Lattice (Pruning a) where
  inf l r = Pruning \(alpha, beta) ->
    let vl = unPruning l (alpha, beta) in
    if NoBot vl <= alpha then vl else min vl (unPruning r (alpha, min (NoTop vl) beta))

  sup l r = Pruning \(alpha, beta) ->
    let vl = unPruning l (alpha, beta) in
    if beta <= NoTop vl then vl else max vl (unPruning r (max (NoBot vl) alpha, beta))

因此,我们获得了α-beta作为minimax的实例。一旦定义了上面的晶格,我们只需要一些简单的包装和解开即可。

alphabeta :: Ord a => Game a -> a
alphabeta = runPruning . gminimax constPruning

constPruning :: a -> Pruning a
constPruning = Pruning . const

runPruning :: Pruning a -> a
runPruning f = unPruning f (Bot, Top)

如果一切顺利, alphabeta minimax 应该具有相同的结果:

main :: IO ()
main = quickCheck \g -> minimax g === alphabeta (g :: Game Int)

Alpha-beta can be seen as an instance of minimax, where min and max are instantiated using a well-chosen lattice. Full gist.

We represent games as a tree, where each internal node is a position in the game, waiting for a designated player to pick a move to a child node, and each leaf is a final position with its score, or value.

-- | At every step, either the game ended with a value/score,
-- or one of the players is to play.
data GameF a r = Value a | Play Player (NonEmpty r)
 deriving Functor
type Game a = Fix (GameF a)

-- | One player wants to maximize the score,
-- the other wants to minimize the score.
data Player = Mini | Maxi

minimax will work on any lattice, defined by the following class:

class Lattice l where
  inf, sup :: l -> l -> l

The Lattice class is more general than Ord: and Ord instance is a Lattice with decidable equality (Eq). If we could redefine Ord, then it would be appropriate to add Lattice as a superclass. But here a newtype will have to do:

-- The Lattice induced by an Ord
newtype Order a = Order { unOrder :: a }
  deriving (Eq, Ord)

instance Ord a => Lattice (Order a) where
  inf = min
  sup = max

Here's minimax. It is parameterized by an embedding leaf :: a -> l of final values to the chosen lattice. One player maximizes the embedded value, the other player minimizes it.

-- | Generalized minimax
gminimax :: Lattice l => (a -> l) -> Game a -> l
gminimax leaf = cata minimaxF where
  minimaxF (Value x) = leaf x
  minimaxF (Play p xs) = foldr1 (lopti p) xs

lopti :: Lattice l => Player -> l -> l -> l
lopti Mini = inf
lopti Maxi = sup

The "regular" minimax uses the scores of the game directly as the lattice:

minimax :: Ord a => Game a -> a
minimax = unOrder . gminimax Order

For alpha-beta pruning, the idea is that we can keep track of some bounds on the optimal score, and this allows us to short-circuit the search. So the search is to be parameterized by that interval (alpha, beta). This leads us to a lattice of functions Interval a -> a:

newtype Pruning a = Pruning { unPruning :: Interval a -> a }

An interval can be represented by (Maybe a, Maybe a) to allow either side to be unbounded. But we shall use better named types for clarity, and also to leverage a different Ord instance on each side:

type Interval a = (WithBot a, WithTop a)
data WithBot a = Bot | NoBot a deriving (Eq, Ord)
data WithTop a = NoTop a | Top deriving (Eq, Ord)

We will require that we can only construct Pruning f if f satisfies clamp i (f i) = clamp i (f (Bot, Top)), where clamp is defined below. That way, f is a search algorithm which may shortcircuit if it learns that its result lies outside of the interval, without having to find the exact result.

clamp :: Ord a => Interval a -> a -> a
clamp (l, r) = clampBot l . clampTop r

clampBot :: Ord a => WithBot a -> a -> a
clampBot Bot x = x
clampBot (NoBot y) x = max y x

clampTop :: Ord a => WithTop a -> a -> a
clampTop Top x = x
clampTop (NoTop y) x = min y x

Functions form a lattice by pointwise lifting. And when we consider only functions satisfying clamp i (f i) = clamp i (f (Bot, Top)) and equate them modulo a suitable equivalence relation (Pruning f = Pruning g if clamp <*> f = clamp <*> g), a short-circuiting definition of the lattice becomes possible.

The inf of two functions l and r, given an interval i = (alpha, beta), first runs l (alpha, beta) to obtain a value vl.
If vl <= alpha, then it must be clamp i vl == alpha == clamp i (min vl (r i)) so we can stop and return vl without looking at r. Otherwise, we run r, knowing that the final result is not going to be more than vl so we can also update the upper bound passed to r. sup is defined symmetrically.

instance Ord a => Lattice (Pruning a) where
  inf l r = Pruning \(alpha, beta) ->
    let vl = unPruning l (alpha, beta) in
    if NoBot vl <= alpha then vl else min vl (unPruning r (alpha, min (NoTop vl) beta))

  sup l r = Pruning \(alpha, beta) ->
    let vl = unPruning l (alpha, beta) in
    if beta <= NoTop vl then vl else max vl (unPruning r (max (NoBot vl) alpha, beta))

Thus we obtain alpha-beta as an instance of minimax. Once the lattice above is defined, we only need some simple wrapping and unwrapping.

alphabeta :: Ord a => Game a -> a
alphabeta = runPruning . gminimax constPruning

constPruning :: a -> Pruning a
constPruning = Pruning . const

runPruning :: Pruning a -> a
runPruning f = unPruning f (Bot, Top)

If all goes well, alphabeta and minimax should have the same result:

main :: IO ()
main = quickCheck \g -> minimax g === alphabeta (g :: Game Int)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文