Minimax算法:成本/评估函数?
一个学校项目让我用 C++ 编写一个日期游戏(示例位于 http:// /www.cut-the-knot.org/Curriculum/Games/Date.shtml),其中计算机玩家必须实现具有 alpha-beta 剪枝的 Minimax 算法。到目前为止,我了解算法背后的目标是最大化潜在收益,同时假设对手会最小化它们。
然而,我阅读的所有资源都没有帮助我理解如何设计极小极大模型所有决策所依据的评估函数。所有示例都为叶节点分配了任意数字,但是,我实际上需要为这些节点分配有意义的值。
直觉告诉我,胜利叶节点的值类似于+1,失败的叶节点为-1,但是中间节点如何评估?
任何帮助将不胜感激。
A school project has me writing a Date game in C++ (example at http://www.cut-the-knot.org/Curriculum/Games/Date.shtml) where the computer player must implement a Minimax algorithm with alpha-beta pruning. Thus far, I understand what the goal is behind the algorithm in terms of maximizing potential gains while assuming the opponent will minify them.
However, none of the resources I read helped me understand how to design the evaluation function the minimax bases all it's decisions on. All the examples have had arbitrary numbers assigned to the leaf nodes, however, I need to actually assign meaningful values to those nodes.
Intuition tells me it'd be something like +1 for a win leaf node, and -1 for a loss, but how do intermediate nodes evaluate?
Any help would be most appreciated.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
最基本的极小极大仅评估叶节点,标记获胜、失败和平局,并将这些值返回树以确定中间节点值。 如果博弈树难以处理,您需要使用截止深度作为极小极大函数的附加参数。一旦达到深度,您需要对不完整状态运行某种评估函数。
极小极大搜索中的大多数评估函数都是特定领域的,因此为您的特定游戏找到帮助可能很困难。只需记住,评估需要返回特定玩家获胜位置的某种百分比期望(通常是最大值,但在使用 negamax 实现时不是)。 几乎任何研究较少的游戏都将与另一个研究较多的游戏非常相似。 这与拾取棒游戏密切相关。 仅使用极小极大和 alpha beta,我猜这个游戏很容易处理。
如果您必须为非终端位置创建一个评估函数,这里对分析棍子游戏有一些帮助,您可以决定它是否对约会游戏有用。
通过查看最终位置以及可能导致该位置的所有动作,开始寻找一种强制结果的方法。 在棍子游戏中,最终位置是指最后一步剩余 3 根或更少的棍子。 因此,紧接着该最终位置的位置将给对手留下 4 根棍子。现在的目标是无论如何都让你的对手留下 4 根棍子,这可以通过留给你 5、6 或 7 根棍子来完成,你想强迫你的对手让你处于这些位置之一。为了让你处于 5、6 或 7 中,你的对手需要处于的位置是 8。 不断地继续这个逻辑,很快就会出现一个模式。 总是给你的对手留下一个能被 4 整除的数字,你就赢了,其他的,你就输了。
这是一个相当简单的游戏,但确定启发式的方法很重要,因为它可以直接应用于您的作业。 由于最后一步先走,并且您一次只能更改 1 个日期属性,因此您知道要获胜,需要正好剩下 2 个步...等等。
祝你好运,让我们知道你最终会做什么。
The most basic minimax evaluates only leaf nodes, marking wins, losses and draws, and backs those values up the tree to determine the intermediate node values. In the case that the game tree is intractable, you need to use a cutoff depth as an additional parameter to your minimax functions. Once the depth is reached, you need to run some kind of evaluation function for incomplete states.
Most evaluation functions in a minimax search are domain specific, so finding help for your particular game can be difficult. Just remember that the evaluation needs to return some kind of percentage expectation of the position being a win for a specific player (typically max, though not when using a negamax implementation). Just about any less researched game is going to closely resemble another more researched game. This one ties in very closely with the game pickup sticks. Using minimax and alpha beta only, I would guess the game is tractable.
If you are must create an evaluation function for non terminal positions, here is a little help with the analysis of the sticks game, which you can decide if its useful for the date game or not.
Start looking for a way to force an outcome by looking at a terminal position and all the moves which can lead to that position. In the sticks game, a terminal position is with 3 or fewer sticks remaining on the last move. The position that immediately proceeds that terminal position is therefore leaving 4 sticks to your opponent. The goal is now leave your opponent with 4 sticks no matter what, and that can be done from either 5, 6 or 7 sticks being left to you, and you would like to force your opponent to leave you in one of those positions. The place your opponent needs to be in order for you to be in either 5, 6 or 7 is 8. Continue this logic on and on and a pattern becomes available very quickly. Always leave your opponent with a number divisible by 4 and you win, anything else, you lose.
This is a rather trivial game, but the method for determining the heuristic is what is important because it can be directly applied to your assignment. Since the last to move goes first, and you can only change 1 date attribute at a time, you know to win there needs to be exactly 2 moves left... and so on.
Best of luck, let us know what you end up doing.
评估函数最简单的情况是 +1 表示获胜,-1 表示失败,0 表示任何未完成的位置。如果你的树足够深,即使这个简单的函数也会给你一个好的玩家。对于任何具有高分支因子的重要游戏,通常您需要一个更好的函数,并带有一些启发式方法(例如,对于国际象棋,您可以为棋子分配权重并求和等)。对于日期游戏,我只会使用最简单的评估函数,所有中间节点都为 0。
顺便说一句,极小极大算法并不是这个特定游戏的最佳算法。但我想你已经知道了。
The simplest case of an evaluation function is +1 for a win, -1 for a loss and 0 for any non-finished position. Given your tree is deep enough, even this simple function will give you a good player. For any non-trivial games, with high branching factor, typically you need a better function, with some heuristics (e.g. for chess you could assign weights to pieces and find a sum, etc.). In the case of the Date game, I would just use the simplest evaluation function, with 0 for all the intermediate nodes.
As a side note, minimax is not the best algorithm for this particular game; but I guess you know it already.
根据我对您链接的约会游戏的理解,似乎玩家唯一可能的结果是赢或输,没有介于两者之间的结果(如果我错了,请纠正我)。
在这种情况下,只需将值 1 分配给获胜位置(当前玩家到达 12 月 31 日),将值 -1 分配给失败位置(其他玩家到达 12 月 31 日)。
你的极小极大算法(没有 alpha-beta 剪枝)看起来像这样:
From what I understand of the Date game you linked to, it seems that the only possible outcomes for a player are win or lose, there is not in between (please correct me if I'm wrong).
In this case, it is only a matter of assigning a value of 1 to a winning position (current player gets to Dec 31) and a value of -1 to the losing positions (other player gets to Dec 31).
Your minimax algorithm (without alpha-beta pruning) would look something like this: