Connect 4 with neural networks: draft evaluation + further steps

Published 2024-10-09 13:16:28

I would like to build a Connect 4 engine which works using an artificial neural network - just because I'm fascinated by ANNs.

I've created the following draft of the ANN structure. Would it work? And are these connections right (even the crossed ones)?

[image: draft sketch of the proposed ANN structure]

Could you help me draft a UML class diagram for this ANN?

I want to give the board representation to the ANN as its input, and the output should be the move to choose.

The learning should later be done using reinforcement learning, and the sigmoid function should be applied. The engine will play against human players, and depending on the result of each game, the weights should then be adjusted.

What I'm looking for ...

... is mainly help with coding issues. The further it moves from abstract thinking toward concrete code, the better.

Answers (4)

八巷 2024-10-16 13:16:28

The below is how I organized my design and code when I was messing with neural networks. The code here is (obviously) pseudocode and roughly follows object-oriented conventions.

Starting from the bottom up, you'll have your neuron. Each neuron needs to be able to hold the weights it puts on the incoming connections, a buffer to hold the incoming connection data, and a list of its outgoing edges. Each neuron needs to be able to do three things:

  • A way to accept data from an incoming edge
  • A method of processing the input data and weights to formulate the value this neuron will be sending out
  • A way of sending out this neuron's value on the outgoing edges

Code-wise this translates to:

// Each neuron needs to keep track of this data
float in_data[]; // Values sent to this neuron
float weights[]; // The weights on each edge
float value; // The value this neuron will be sending out
Neuron out_edges[]; // Each Neuron that this neuron should send data to

// Each neuron should expose this functionality
void accept_data( float data ) {
    in_data.append(data); // Add the data to the incoming data buffer
}
void process() {
    value = /* result of combining weights and incoming data here */;
}
void send_value() {
    foreach ( neuron in out_edges ) {
        neuron.accept_data( value );
    }
}
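As one concrete illustration, the three methods above might look like this in Python. The weighted-sum-plus-sigmoid body of `process()` is an assumption (the question mentions wanting a sigmoid); the pseudocode itself leaves that combination step open:

```python
import math

class Neuron:
    def __init__(self, weights, out_edges=None):
        self.weights = weights            # one weight per incoming value
        self.in_data = []                 # buffer for values received this pass
        self.value = 0.0                  # the value this neuron will send out
        self.out_edges = out_edges or []  # downstream Neuron objects

    def accept_data(self, data):
        self.in_data.append(data)

    def process(self):
        # Combine weights and inputs as a weighted sum, then squash
        # through a sigmoid (an assumed choice of activation).
        total = sum(w * x for w, x in zip(self.weights, self.in_data))
        self.value = 1.0 / (1.0 + math.exp(-total))
        self.in_data = []                 # reset the buffer for the next pass

    def send_value(self):
        for neuron in self.out_edges:
            neuron.accept_data(self.value)
```

Clearing `in_data` at the end of `process()` is a design choice here: it keeps the buffer fresh for the next forward pass without a separate reset step.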

Next, I found it easiest if you make a Layer class which holds a list of neurons. (It's quite possible to skip over this class and just have your NeuralNetwork hold a list of lists of neurons. I found it easier, both organizationally and for debugging, to have a Layer class.) Each layer should expose the ability to:

  • Cause each neuron to 'fire'
  • Return the raw array of neurons that this Layer wraps around. (This is useful when you need to do things like manually filling in input data in the first layer of a neural network.)

Code-wise this translates to:

//Each layer needs to keep track of this data.
Neuron[] neurons;

//Each layer should expose this functionality.
void fire() {
    foreach ( neuron in neurons ) {
        neuron.process();     // compute this neuron's value from its buffered inputs
        neuron.send_value();  // then push that value along the outgoing edges
    }
}
Neuron[] get_neurons() {
    return neurons;
}

Finally, you have a NeuralNetwork class that holds a list of layers, a way of setting up the first layer with initial data, a learning algorithm, and a way to run the whole neural network. In my implementation, I collected the final output data by adding a fourth layer consisting of a single fake neuron that simply buffered all of its incoming data and returned that.

// Each neural network needs to keep track of this data.
Layer[] layers;

// Each neural network should expose this functionality
void initialize( float[] input_data ) {
    foreach ( neuron in layers[0].get_neurons() ) {
        // do setup work here
    }
}
void learn() {
    foreach ( layer in layers ) {
        foreach ( neuron in layer ) {
            /* compare the neuron's computed value to the value it
             * should have generated and adjust the weights accordingly
             */
        }
    }
}
void run() {
    foreach (layer in layers) {
        layer.fire();
    }
}
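Assembled into one self-contained sketch, the three fragments might run like this in Python. The fixed starter weights, the sigmoid activation, and the fully connected wiring between layers are all illustrative assumptions; the original pseudocode leaves them open:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class Neuron:
    def __init__(self, n_inputs):
        self.weights = [0.1] * n_inputs  # fixed starter weights, for illustration only
        self.in_data = []                # buffered incoming values
        self.value = 0.0
        self.out_edges = []

    def accept_data(self, data):
        self.in_data.append(data)

    def process(self):
        total = sum(w * x for w, x in zip(self.weights, self.in_data))
        self.value = sigmoid(total)
        self.in_data = []                # clear the buffer for the next pass

    def send_value(self):
        for neuron in self.out_edges:
            neuron.accept_data(self.value)

class Layer:
    def __init__(self, neurons):
        self.neurons = neurons

    def fire(self):
        for neuron in self.neurons:
            neuron.process()
            neuron.send_value()

    def get_neurons(self):
        return self.neurons

class NeuralNetwork:
    def __init__(self, layers):
        self.layers = layers
        # Fully connect each layer to the next (the pseudocode leaves the wiring implicit).
        for current, following in zip(layers, layers[1:]):
            for neuron in current.get_neurons():
                neuron.out_edges = list(following.get_neurons())

    def initialize(self, input_data):
        for neuron, x in zip(self.layers[0].get_neurons(), input_data):
            neuron.accept_data(x)

    def run(self):
        for layer in self.layers:
            layer.fire()
        # Read the result off the last layer's neurons.
        return [n.value for n in self.layers[-1].get_neurons()]

net = NeuralNetwork([Layer([Neuron(1), Neuron(1)]),  # input layer, one value each
                     Layer([Neuron(2)])])            # output layer
net.initialize([1.0, -1.0])
output = net.run()
```

Here the final layer doubles as the "fake" output-collecting layer: since its neurons have no outgoing edges, `run()` simply reads their values after the last `fire()`.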

I recommend starting with backpropagation as your learning algorithm, since it's supposedly the easiest to implement. When I was working on this, I had great difficulty finding a really simple explanation of the algorithm, but my notes list this site as being a good reference.

I hope that's enough to get you started!

旧话新听 2024-10-16 13:16:28

There are a lot of different ways to implement neural networks that range from simple/easy-to-understand to highly-optimized. The Wikipedia article on backpropagation that you linked to has links to implementations in C++, C#, Java, etc. which could serve as good references, if you're interested in seeing how other people have done it.

One simple architecture would model both nodes and connections as separate entities; nodes would have possible incoming and outgoing connections to other nodes as well as activation levels and error values, whereas connections would have weight values.

Alternatively, there are more efficient ways to represent those nodes and connections -- as arrays of floating point values organized by layer, for example. This makes things a bit trickier to code, but avoids creating so many objects and pointers to objects.

One note: often people will include a bias node -- in addition to the normal input nodes -- that provides a constant value to every hidden and output node.
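A minimal sketch of that array-of-floats idea, with the bias node folded in as a constant extra input of 1.0 (the layer sizes and the all-zero weights are purely illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feed_forward(inputs, weight_matrices):
    """Run one forward pass. Each weight matrix has one row per output node;
    the last entry in each row is the weight on the constant bias input."""
    activations = list(inputs)
    for matrix in weight_matrices:
        extended = activations + [1.0]  # the bias node's constant contribution
        activations = [sigmoid(sum(w * x for w, x in zip(row, extended)))
                       for row in matrix]
    return activations

# Two inputs -> two hidden nodes -> one output node, all weights zero,
# so every node produces sigmoid(0) = 0.5.
hidden = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
output = [[0.0, 0.0, 0.0]]
result = feed_forward([1.0, -1.0], [hidden, output])
```

Compared with a Neuron-object design, the whole network state is just a few flat lists, which is the efficiency gain the paragraph above alludes to.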

浅沫记忆 2024-10-16 13:16:28

I've implemented neural networks before, and see a few problems with your proposed architecture:

  1. A typical multi-layer network has connections from every input node to every hidden node, and from every hidden node to every output node. This allows information from all of the inputs to be combined and contribute to each output. If you dedicate 4 hidden nodes to each input, then you will lose some of the network's power to identify relationships between the inputs and outputs.

  2. How will you come up with values to train the network? Your network creates a mapping between board positions and the optimal next move, so you need a set of training examples that provide this. End game moves are easy to identify, but how do you tell that a mid-game move is "optimal"? (Reinforcement learning can help out here)

One last suggestion is to use bipolar inputs (-1 for false, +1 for true) since this can speed up learning. And Nate Kohl makes a good point: every hidden & output node will benefit from having a bias connection (think of it as another input node with a fixed value of "1").
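For Connect 4 specifically, a bipolar board encoding might look like the following; using two true/false questions per cell (one per player) is an illustrative choice, not something the answer prescribes:

```python
# Encode a 6x7 Connect 4 board as bipolar inputs: for each cell, one input
# answers "is my piece here?" and one answers "is the opponent's piece here?",
# each +1 for true and -1 for false.
EMPTY, MINE, THEIRS = 0, 1, 2

def encode_board(board):
    inputs = []
    for row in board:
        for cell in row:
            inputs.append(1.0 if cell == MINE else -1.0)
            inputs.append(1.0 if cell == THEIRS else -1.0)
    return inputs

board = [[EMPTY] * 7 for _ in range(6)]
board[5][3] = MINE  # one of my pieces dropped in the middle column
inputs = encode_board(board)  # 6 * 7 * 2 = 84 input values
```

Any scheme works as long as it is consistent; the point of the bipolar values is simply that inputs of -1/+1 tend to train faster than 0/1.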

给我一枪 2024-10-16 13:16:28

Your design will be highly dependent on the specific type of reinforcement learning that you plan to use.

The simplest solution would be to use back propagation. This is done by feeding the error back into the network (in reverse fashion) and using the derivative of the (sigmoid) function to determine the adjustment to each weight. After a number of iterations, the weights will automatically get adjusted to fit the input.
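For a single weight feeding an output node, that update might be sketched as follows; the standard rule uses the sigmoid's derivative, which for output = sigmoid(net) is simply output * (1 - output), and the learning rate and numbers here are illustrative:

```python
def update_output_weight(weight, input_value, output, target, learning_rate=0.5):
    """One gradient step for a weight into an output node.

    delta = error * sigmoid'(net); since output = sigmoid(net), the
    derivative is conveniently output * (1 - output).
    """
    delta = (target - output) * output * (1.0 - output)
    return weight + learning_rate * delta * input_value

# A node that produced 0.73 when the target was 1.0 nudges this weight upward.
new_w = update_output_weight(weight=0.4, input_value=1.0, output=0.73, target=1.0)
```

Hidden-layer weights use the same shape of update, except their error term is accumulated from the deltas of the nodes they feed into.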

Genetic Algorithms are an alternative to back propagation which can yield better results (although a bit slower). This is done by treating the weights as a schema that can easily be inserted and removed. The schema is replaced with a mutated version (using principles of natural selection) several times until a fit is found.

As you can see, the implementation for each of these would be drastically different. You could try to make the network generic enough to adapt to each type of implementation, but that may overcomplicate it. Once you are in production, you will usually only have one form of training (or ideally your network would already be trained).
