Implementing a perceptron with backpropagation

I am trying to implement a two-layer perceptron with backpropagation to solve the parity problem. The network has 4 binary inputs, 4 hidden units in the first layer, and 1 output in the second layer. I am using this as a reference, but I am having problems with convergence.

First, I will note that I am using a sigmoid function for activation, so the derivative is (from what I understand) sigmoid(v) * (1 - sigmoid(v)). That is what I use when calculating the delta values.
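
As a minimal sketch (illustrative method names, not my exact code), the activation and its derivative look like this in Java:

    // Logistic sigmoid activation.
    static double sigmoid(double v) {
        return 1.0 / (1.0 + Math.exp(-v));
    }

    // Derivative of the sigmoid written in terms of its output y = sigmoid(v);
    // this is why the deltas below multiply by value * (1 - value).
    static double sigmoidDerivative(double y) {
        return y * (1.0 - y);
    }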

So, basically, I set up the network and run it for a few epochs (going through each possible pattern -- in this case, 16 input patterns). After the first epoch, the weights change slightly. After the second, the weights stop changing and stay fixed no matter how many more epochs I run. I am using a learning rate of 0.1 and a bias of +1 for now.

The training process is shown below in pseudocode (which I believe to be correct according to the sources I've checked):

Feed Forward Step:

    v = SUM[weight connecting input to hidden * input value] + bias
    y = Sigmoid(v)
    set hidden.values to y
    v = SUM[weight connecting hidden to output * hidden value] + bias
    y = Sigmoid(v)
    set output value to y
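
In Java, the forward pass would look roughly like the following (a sketch only; the array and parameter names are assumptions, not my actual class layout):

    // Hypothetical forward pass for the 4-4-1 network described above.
    static double feedForward(double[] inputs, double[][] hiddenWeights,
                              double[] outputWeights, double bias,
                              double[] hiddenValues) {
        // Hidden layer: weighted sum of the inputs plus bias, squashed by the sigmoid.
        for (int h = 0; h < hiddenValues.length; h++) {
            double v = bias;
            for (int i = 0; i < inputs.length; i++) {
                v += hiddenWeights[h][i] * inputs[i];
            }
            hiddenValues[h] = sigmoid(v);
        }
        // Output layer: weighted sum of the hidden values plus bias.
        double v = bias;
        for (int h = 0; h < hiddenValues.length; h++) {
            v += outputWeights[h] * hiddenValues[h];
        }
        return sigmoid(v);
    }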

Backpropagation of Output Layer:

    error = desired - output.value
    outputDelta = error * output.value * (1 - output.value)

Backpropagation of Hidden Layer:

    for each hidden neuron h:
        error = outputDelta * weight connecting h to output
        hiddenDelta[i] = error * h.value * (1 - h.value)
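
Concretely, the two delta computations might look like this (again a sketch with assumed names, reusing the sigmoidDerivative helper from above):

    // Delta for the single output neuron: error times the sigmoid derivative.
    static double computeOutputDelta(double desired, double outputValue) {
        double error = desired - outputValue;
        return error * sigmoidDerivative(outputValue);
    }

    // Deltas for the hidden layer: each hidden neuron's share of the output
    // error, scaled by the derivative of its own activation.
    static double[] computeHiddenDeltas(double outputDelta, double[] outputWeights,
                                        double[] hiddenValues) {
        double[] deltas = new double[hiddenValues.length];
        for (int h = 0; h < hiddenValues.length; h++) {
            double error = outputDelta * outputWeights[h];
            deltas[h] = error * sigmoidDerivative(hiddenValues[h]);
        }
        return deltas;
    }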

Update Weights:

    for each hidden neuron h connected to the output layer
        h.weight connecting h to output = learningRate * outputDelta * h.value

    for each input neuron x connected to the hidden layer
        x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value

This process is of course looped over the epochs, and the weight changes persist between epochs. So, my question is: is there any reason the weights would remain constant after the second epoch? If necessary I can post my code, but at the moment I am hoping there is something obvious that I'm overlooking. Thanks, all!

EDIT: Here are the links to my code as suggested by sarnold:
MLP.java: http://codetidy.com/1903
Neuron.java: http://codetidy.com/1904
Pattern.java: http://codetidy.com/1905
input.txt: http://codetidy.com/1906

1 Answer

I think I spotted the problem; funnily enough, what I found is visible in your high-level description, but I only noticed it because it looked odd in the code. First, the description:

    for each hidden neuron h connected to the output layer
        h.weight connecting h to output = learningRate * outputDelta * h.value

    for each input neuron x connected to the hidden layer
        x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value

I believe the h.weight should be updated with respect to the previous weight. Your update mechanism sets it based only on the learning rate, the output delta, and the value of the node. Similarly, the x.weight is also being set based on the learning rate, the hidden delta, and the value of the node:

    /*** Weight updates ***/

    // update weights connecting hidden neurons to output layer
    for (i = 0; i < output.size(); i++) {
        for (Neuron h : output.get(i).left) {
            h.weights[i] = learningRate * outputDelta[i] * h.value;
        }
    }

    // update weights connecting input neurons to hidden layer
    for (i = 0; i < hidden.size(); i++) {
        for (Neuron x : hidden.get(i).left) {
            x.weights[i] = learningRate * hiddenDelta[i] * x.value;
        }
    }

I do not know what the correct solution is, but I have two suggestions:

  1. Replace these lines:

            h.weights[i] = learningRate * outputDelta[i] * h.value;
            x.weights[i] = learningRate * hiddenDelta[i] * x.value;
    

    with these lines:

            h.weights[i] += learningRate * outputDelta[i] * h.value;
            x.weights[i] += learningRate * hiddenDelta[i] * x.value;
    

    (+= instead of =; the full update loops with this change are sketched after this list.)

  2. Replace these lines:

            h.weights[i] = learningRate * outputDelta[i] * h.value;
            x.weights[i] = learningRate * hiddenDelta[i] * x.value;
    

    with these lines:

            h.weights[i] *= learningRate * outputDelta[i];
            x.weights[i] *= learningRate * hiddenDelta[i];
    

    (Ignore the value and simply scale the existing weight. The learning rate should be 1.05 instead of .05 for this change.)
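
For concreteness, suggestion 1 applied to your posted update loops would look like this (same structure as the quoted code, only = changed to +=):

    /*** Weight updates (suggestion 1: accumulate rather than overwrite) ***/

    // update weights connecting hidden neurons to output layer
    for (i = 0; i < output.size(); i++) {
        for (Neuron h : output.get(i).left) {
            h.weights[i] += learningRate * outputDelta[i] * h.value;
        }
    }

    // update weights connecting input neurons to hidden layer
    for (i = 0; i < hidden.size(); i++) {
        for (Neuron x : hidden.get(i).left) {
            x.weights[i] += learningRate * hiddenDelta[i] * x.value;
        }
    }

With +=, each pass takes a gradient-descent step from the current weight (w = w + learningRate * delta * value) instead of discarding the old weight entirely, which is the usual update rule.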
