How do I program a neural network for chess?

Posted on 2024-07-17 07:43:08

I want to program a chess engine that learns to make good moves and win against other players. I've already coded a representation of the chess board and a function that outputs all possible moves, so all I still need is an evaluation function that says how good a given board position is. I would like to use an artificial neural network to evaluate a given position. The output should be a numerical value: the higher the value, the better the position is for White.

My approach is to build a network with 385 input neurons: there are six distinct piece types and 64 squares on the board, so every square gets 6 neurons (one per piece type). If a white piece of that type stands on the square, the input value is 1; if a black piece, the value is -1; if there is no piece of that type on the square, the value is 0. In addition, there should be 1 neuron for the side to move: 1 if it is White's turn and -1 if it is Black's.
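A minimal sketch of the encoding described above, in Python rather than Delphi. The board layout (a 64-element list with uppercase letters for White and lowercase for Black) and the names `PIECE_KINDS` and `encode_position` are assumptions made for illustration, not part of the question.

```python
# Piece types in a fixed order; each square gets one input per type.
PIECE_KINDS = "PNBRQK"

def encode_position(board, white_to_move):
    """Encode a position as 385 floats, each -1, 0, or 1.

    `board` is a hypothetical 64-element list (a1..h8) whose entries are
    None for an empty square, or a letter from PIECE_KINDS: uppercase for
    White, lowercase for Black.
    """
    inputs = [0.0] * (64 * 6 + 1)
    for square, piece in enumerate(board):
        if piece is None:
            continue
        kind = PIECE_KINDS.index(piece.upper())
        inputs[square * 6 + kind] = 1.0 if piece.isupper() else -1.0
    inputs[-1] = 1.0 if white_to_move else -1.0  # side-to-move input
    return inputs

# Example: only the two kings on the board, White to move.
board = [None] * 64
board[4] = "K"    # white king on e1
board[60] = "k"   # black king on e8
x = encode_position(board, white_to_move=True)
```

Only three of the 385 inputs are non-zero in this example: one per king, plus the side-to-move input.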

I think this configuration of the neural network is quite good. But the main part is missing: how can I implement this neural network in a programming language (e.g. Delphi)? I think the weights for each neuron should be the same in the beginning. Depending on the result of a match, the weights should then be adjusted. But how? I think I should let 2 computer players (both using my engine) play against each other. If White wins, Black gets the feedback that its weights aren't good.

So it would be great if you could help me implement the neural network in a programming language (ideally Delphi, otherwise pseudo-code). Thanks in advance!

Comments (8)

淡写薰衣草的香 2024-07-24 07:43:08

In case somebody randomly finds this page: given what we know now, what the OP proposes is almost certainly possible. In fact, we managed to do it for a game with a much larger state space, Go ( https://deepmind.com/research/case-studies/alphago-the-story-so-far ).

-残月青衣踏尘吟 2024-07-24 07:43:08

I don't see why you can't have a neural net for a static evaluator if you also do some classic mini-max lookahead with alpha-beta pruning. Lots of Chess engines use minimax with a braindead static evaluator that just adds up the pieces or something; it doesn't matter so much if you have enough levels of minimax. I don't know how much of an improvement the net would make but there's little to lose. Training it would be tricky though. I'd suggest using an engine that looks ahead many moves (and takes loads of CPU etc) to train the evaluator for an engine that looks ahead fewer moves. That way you end up with an engine that doesn't take as much CPU (hopefully).

Edit: I wrote the above in 2010, and now in 2020 Stockfish NNUE has done it. "The network is optimized and trained on the [classical Stockfish] evaluations of millions of positions at moderate search depth" and then used as a static evaluator, and in their initial tests they got an 80-elo improvement when using this static evaluator instead of their previous one (or, equivalently, the same elo with a little less CPU time). So yes it does work, and you don't even have to train the network at high search depth as I originally suggested: moderate search depth is enough, but the key is to use many millions of positions.
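The idea of training an evaluator on scores produced by a stronger search can be sketched as plain supervised regression. This is only an illustration under loud assumptions: `teacher_score` is a made-up stand-in (a material count) for a deep search, the three features are hypothetical, and a linear model replaces the network to keep the example short. It is not how Stockfish NNUE is actually trained.

```python
import random

def teacher_score(features):
    """Hypothetical stand-in for a deep-search evaluation: material balance."""
    pawns, knights, rooks = features
    return 1.0 * pawns + 3.0 * knights + 5.0 * rooks

def train_to_teacher(samples, n_inputs, epochs=200, lr=0.01):
    """Fit a linear evaluator w.x + b to teacher scores.

    Uses stochastic gradient descent on the squared error between the
    evaluator's prediction and the teacher's score.
    """
    w = [0.0] * n_inputs
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - target
            for i, xi in enumerate(x):
                w[i] -= lr * err * xi
            b -= lr * err
    return w, b

# Toy data: piece-count differences (White minus Black) for 50 positions.
rng = random.Random(0)
positions = [[rng.randint(-2, 2) for _ in range(3)] for _ in range(50)]
samples = [(x, teacher_score(x)) for x in positions]
weights, bias = train_to_teacher(samples, n_inputs=3)
```

On this toy data the learned weights approach the teacher's piece values; with a real engine as teacher, the same loop would need millions of positions, as the quote says.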

无语# 2024-07-24 07:43:08

Been there, done that. Since there is no continuity in your problem (the value of a position is not closely related to that of another position that differs in only one input value), there is very little chance a NN would work. And it never did in my experiments.

I would rather see a simulated annealing system with an ad-hoc heuristic (of which there are plenty out there) to evaluate the value of the position...

However, if you are set on using a NN, it is relatively easy to represent. A general NN is simply a graph, with each node being a neuron. Each neuron has a current activation value and a transition formula to compute the next activation value from its input values, i.e. the activation values of all the nodes that have a link to it.

A more classical NN, that is with an input layer, an output layer, identical neurons for each layer, and no time-dependency, can thus be represented by an array of input nodes, an array of output nodes, and a linked graph of nodes connecting those. Each node possesses a current activation value, and a list of nodes it forwards to. Computing the output value is simply setting the activations of the input neurons to the input values, and iterating through each subsequent layer in turn, computing the activation values from the previous layer using the transition formula. When you have reached the last (output) layer, you have your result.
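The layer-by-layer computation described above can be sketched as follows. The `tanh` transition formula, the small random initial weights, and the hidden layer size of 16 are assumptions chosen for illustration; the names `make_layer` and `forward` are hypothetical.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """One fully connected layer: a weight row and a bias per output neuron."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layers, inputs):
    """Propagate the input activations through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

rng = random.Random(0)
# 385 inputs -> 16 hidden neurons -> 1 output (the position score).
net = [make_layer(385, 16, rng), make_layer(16, 1, rng)]
score = forward(net, [0.0] * 385)[0]
```

Because of the final `tanh`, the score always lies between -1 and 1, which maps naturally onto "good for Black" through "good for White".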

又爬满兰若 2024-07-24 07:43:08

It is possible, but not trivial by any means.

https://erikbern.com/2014/11/29/deep-learning-for-chess/

To train his evaluation function, he used a lot of computing power.

In general terms, you could go about it as follows. Your evaluation function is a feedforward NN. Let the matrix computations lead to a scalar output that says how good the move is. The input vector for the network is the board state, represented by all the pieces on the board: say a white pawn is 1, a white knight is 2, and so on, with an empty square being 0. A board state input vector is then simply a sequence of integers from 0 to 12. This evaluation can be trained on many grandmaster games (available from a FICS database, for example) by minimizing the loss between what the current parameters say is the highest valuation and the move the grandmaster actually made (which should have the highest valuation). This of course assumes that the grandmaster moves are correct and optimal.
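One possible way to write such a loss is a hinge (margin) loss over the scores of the candidate moves. This is an assumption for illustration and not necessarily the loss used in the linked post; `expert_move_loss` and `margin` are hypothetical names.

```python
def expert_move_loss(scores, expert_index, margin=0.1):
    """Hinge-style loss over the evaluator's scores for all candidate moves.

    Zero when the expert's move outscores every other candidate by at
    least `margin`, positive otherwise.
    """
    expert_score = scores[expert_index]
    rivals = [s for i, s in enumerate(scores) if i != expert_index]
    if not rivals:
        return 0.0
    return max(0.0, max(rivals) - expert_score + margin)

# The evaluator already prefers move 1, so choosing it costs nothing,
# while an expert choice of move 0 would be penalized.
good = expert_move_loss([0.2, 0.9, 0.1], expert_index=1)
bad = expert_move_loss([0.2, 0.9, 0.1], expert_index=0)
```

Minimizing this loss over many positions pushes the network to rank the grandmaster's move highest.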

你与昨日 2024-07-24 07:43:08

What you need to train an ANN is either something like backpropagation learning or some form of genetic algorithm. But chess is such a complex game that it is unlikely that a simple ANN will learn to play it, even more so if the learning process is unsupervised.

Further, your question does not say anything about the number of layers. You want to use 385 input neurons to encode the current situation. But how do you want to decide what to do? One neuron per field? Highest excitation wins? But there is often more than one possible move.

Further, you will need several hidden layers: the functions that can be represented with only an input and an output layer, without a hidden layer, are really limited.
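XOR is the standard small example of this limitation: a single threshold neuron fed directly from the inputs cannot compute it, but one hidden layer with two neurons can. The weights below are hand-picked for illustration rather than learned.

```python
def step(z):
    """Threshold activation: fires (1) when the weighted input is positive."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h_or = step(x1 + x2 - 0.5)       # hidden neuron: at least one input is 1
    h_and = step(x1 + x2 - 1.5)      # hidden neuron: both inputs are 1
    return step(h_or - h_and - 0.5)  # output: "or" but not "and"

table = [xor_net(a, b) for a in (0, 1) for b in (0, 1)]
# table == [0, 1, 1, 0]
```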

So I do not want to discourage you from trying it, but the chances of a successful implementation and training within, say, a year or so are practically zero.

I tried to build and train an ANN to play Tic-tac-toe when I was 16 or so ... and I failed. I would suggest trying such a simple game first.

伊面 2024-07-24 07:43:08

The main problem I see here is one of training. You say you want your ANN to take the current board position and evaluate how good it is for a player. (I assume you will take every possible move for a player, apply it to the current board state, evaluate the result via the ANN, and then take the move with the highest output, i.e. hill climbing.)
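That one-ply selection can be sketched as below. The callbacks `legal_moves`, `apply_move`, and `evaluate` are hypothetical placeholders for the OP's own move generator, board update, and network; the integer "positions" at the end are a toy stand-in for a real board.

```python
def pick_move(position, legal_moves, apply_move, evaluate, white_to_move=True):
    """Try every legal move and keep the one whose resulting position
    scores best for the side to move (higher scores favor White)."""
    best_move, best_score = None, None
    for move in legal_moves(position):
        score = evaluate(apply_move(position, move))
        if not white_to_move:
            score = -score  # Black wants the lowest evaluation
        if best_score is None or score > best_score:
            best_move, best_score = move, score
    return best_move

# Toy stand-ins: a position is an integer and a move adds to it.
def toy_moves(position):
    return [+1, -2, +3]

def toy_apply(position, move):
    return position + move

def toy_evaluate(position):
    return float(position)

white_choice = pick_move(0, toy_moves, toy_apply, toy_evaluate, white_to_move=True)
black_choice = pick_move(0, toy_moves, toy_apply, toy_evaluate, white_to_move=False)
```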

Your options as I see them are:

  • Develop some heuristic function to evaluate the board state and train the network off that. But that begs the question of why use an ANN at all, when you could just use your heuristic.

  • Use some statistical measure such as "How many games were won by White or by Black from this board configuration?", which would give you a fitness value on a scale between White and Black. The difficulty with that is the amount of training data required for the size of your problem space.

With the second option you could always feed it board sequences from grandmaster games and hope there is enough coverage for the ANN to develop a solution.
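A rough sketch of how such statistical targets could be computed from a game collection. The input format (a list of position keys per game plus a result of 1, 0, or -1 from White's point of view) and the name `position_values` are assumptions made for illustration.

```python
from collections import defaultdict

def position_values(games):
    """Average game result per position.

    `games` is a list of (position_keys, result) pairs, where result is
    1 for a White win, 0 for a draw, and -1 for a Black win.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of results, count]
    for position_keys, result in games:
        for key in position_keys:
            totals[key][0] += result
            totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Three toy games; "a", "b", "c" stand for hashed board positions.
games = [(["a", "b"], 1), (["a", "c"], -1), (["a", "b"], 1)]
values = position_values(games)
```

Positions that occur in only a handful of games get very noisy values, which is the coverage problem mentioned above.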

Due to the complexity of the problem, I'd want to throw the largest network (i.e. lots of internal nodes) at it that I could without slowing down the training too much.

冰雪梦之恋 2024-07-24 07:43:08

Your input algorithm is sound - all positions, all pieces, and both players are accounted for. You may need an input layer for every past state of the gameboard, so that past events are used as input again.

The output layer should (in some form) give the piece to move, and the location to move to.

Write a genetic algorithm using a connectome which contains all neuron weights and synapse strengths, and begin multiple separated gene pools with a large number of connectomes in each.

Make them play one another, keep the best handful, crossover and mutate the best connectomes to repopulate the pool.
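A compact sketch of that loop for a single gene pool. One loud simplification: instead of making the genomes play one another, it scores each genome with a stand-alone `fitness` callback (in a chess setting this would be replaced by tournament results). All names and parameter values are hypothetical.

```python
import random

def evolve(fitness, genome_len, pop_size=30, keep=5, generations=40,
           mutation_scale=0.1, seed=0):
    """Keep the best `keep` genomes each generation and refill the pool
    with mutated crossovers of them."""
    rng = random.Random(seed)
    pool = [[rng.uniform(-1, 1) for _ in range(genome_len)]
            for _ in range(pop_size)]
    for _ in range(generations):
        pool.sort(key=fitness, reverse=True)
        elite = pool[:keep]
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            # Uniform crossover: each gene comes from one of the parents.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Mutation: small Gaussian noise on every gene.
            child = [g + rng.gauss(0, mutation_scale) for g in child]
            children.append(child)
        pool = elite + children
    return max(pool, key=fitness)

# Toy fitness: genomes are better the closer every gene is to 0.5.
def toy_fitness(genome):
    return -sum((g - 0.5) ** 2 for g in genome)

best = evolve(toy_fitness, genome_len=4)
```

Because the elite survive unchanged, the best fitness in the pool never gets worse from one generation to the next.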

泪痕残 2024-07-24 07:43:08

Came here to say what Silas said. Using a minimax algorithm, you can expect to be able to look ahead N moves. Using alpha-beta pruning, you can in theory extend that to 2*N moves, but more realistically about 4*N/3 moves. Neural networks are really appropriate here.
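For reference, a generic sketch of minimax with alpha-beta pruning. The `children` and `evaluate` callbacks are hypothetical; in an engine they would be the move generator and the static evaluator (for example a neural network). The nested-list tree at the end is a toy stand-in for a game tree.

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax value of `node`, skipping branches that cannot change the result."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will never allow this line
        return value
    value = float("inf")
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return value

# Toy tree: inner nodes are lists, leaves are static evaluations.
def toy_children(node):
    return node if isinstance(node, list) else []

def toy_evaluate(node):
    return node

tree = [[3, 5], [2, 9]]
best = alphabeta(tree, 2, float("-inf"), float("inf"), True,
                 toy_children, toy_evaluate)
# best == 3: after seeing the leaf 2, the leaf 9 is never evaluated.
```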

Perhaps though a genetic algorithm could be used.
