Implementing a perceptron with the backpropagation algorithm
I am trying to implement a two-layer perceptron with backpropagation to solve the parity problem. The network has 4 binary inputs, 4 hidden units in the first layer and 1 output in the second layer. I am using this for reference, but am having problems with convergence.
First, I will note that I am using a sigmoid function for activation, so the derivative is (from what I understand) sigmoid(v) * (1 - sigmoid(v)). That is what I use when calculating the delta values.
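For concreteness, here is a small Java sketch of what I mean; the method names are just illustrative, not necessarily the ones in my code:

// Sigmoid activation.
static double sigmoid(double v) {
    return 1.0 / (1.0 + Math.exp(-v));
}

// Its derivative, written in terms of the already-computed activation y = sigmoid(v).
static double sigmoidDerivative(double y) {
    return y * (1.0 - y);
}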
So, basically I set up the network and run for just a few epochs (go through each possible pattern -- in this case, 16 patterns of input). After the first epoch, the weights are changed slightly. After the second, the weights do not change and remain so no matter how many more epochs I run. I am using a learning rate of 0.1 and a bias of +1 for now.
The process of training the network is below in pseudocode (which I believe to be correct according to sources I've checked):
Feed Forward Step:
v = SUM[weight connecting input to hidden * input value] + bias
y = Sigmoid(v)
set hidden.values to y
v = SUM[weight connecting hidden to output * hidden value] + bias
y = Sigmoid(v)
set output value to y
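In Java the feed-forward step looks roughly like the sketch below; hidden, output, inputs and bias are placeholder names, and I am assuming a Neuron class with a double[] weights array and a double value field:

// Feed-forward sketch: input layer -> hidden layer -> single output neuron.
for (Neuron h : hidden) {
    double v = bias;                              // bias of +1, added directly as in the pseudocode
    for (int i = 0; i < inputs.length; i++) {
        v += h.weights[i] * inputs[i];            // weight connecting input i to this hidden neuron
    }
    h.value = sigmoid(v);
}

double vOut = bias;
for (int j = 0; j < hidden.length; j++) {
    vOut += output.weights[j] * hidden[j].value;  // weight connecting hidden j to the output
}
output.value = sigmoid(vOut);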
Backpropagation of Output Layer:
error = desired - output.value
outputDelta = error * output.value * (1 - output.value)
Backpropagation of Hidden Layer:
for each hidden neuron h:
error = outputDelta * weight connecting h to output
hiddenDelta[i] = error * h.value * (1 - h.value)
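As Java, the two delta computations together would look roughly like this (same placeholder names as above):

// Output delta: error times the sigmoid derivative at the output.
double error = desired - output.value;
double outputDelta = error * output.value * (1.0 - output.value);

// Hidden deltas: push the output delta back through the hidden-to-output weights.
double[] hiddenDelta = new double[hidden.length];
for (int i = 0; i < hidden.length; i++) {
    double hErr = outputDelta * output.weights[i];          // weight connecting hidden i to the output
    hiddenDelta[i] = hErr * hidden[i].value * (1.0 - hidden[i].value);
}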
Update Weights:
for each hidden neuron h connected to the output layer
h.weight connecting h to output = learningRate * outputDelta * h.value
for each input neuron x connected to the hidden layer
x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value
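Rendered as Java with the same placeholder names, the update step exactly as written above is:

// Weight updates exactly as in the pseudocode: plain assignment, so the
// previous weight value is overwritten rather than adjusted.
for (int j = 0; j < hidden.length; j++) {
    output.weights[j] = learningRate * outputDelta * hidden[j].value;       // hidden j -> output
}
for (int i = 0; i < hidden.length; i++) {
    for (int k = 0; k < inputs.length; k++) {
        hidden[i].weights[k] = learningRate * hiddenDelta[i] * inputs[k];   // input k -> hidden i
    }
}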
This process is of course looped through the epochs and the weight changes persist. So, my question is, are there any reasons that the weights remain constant after the second epoch? If necessary I can post my code, but at the moment I am hoping for something obvious that I'm overlooking. Thanks all!
EDIT: Here are the links to my code as suggested by sarnold:
MLP.java: http://codetidy.com/1903
Neuron.java: http://codetidy.com/1904
Pattern.java: http://codetidy.com/1905
input.txt: http://codetidy.com/1906
1 Answer
I think I spotted the problem; funny enough, what I found is visible in your high-level description, but I only noticed it because it looked odd in the code. First, the description:

I believe h.weight should be updated with respect to the previous weight. Your update mechanism sets it based only on the learning rate, the output delta, and the value of the node. Similarly, x.weight is also being set based only on the learning rate, the hidden delta, and the value of the node.

I do not know what the correct solution is, but I have two suggestions:

Change the weight-update assignments so the new term is added to the existing weight (+= instead of =).

Or ignore the node's value and simply scale the existing weight; the learning rate should be 1.05 instead of .05 for this change.
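Applied to the pseudocode from the question, the first suggestion would turn the update into an accumulation, roughly as below (placeholder names as in the question's sketches; this illustrates the suggestion, not the poster's actual MLP.java):

// Add the correction to the existing weight instead of overwriting it.
for (int j = 0; j < hidden.length; j++) {
    output.weights[j] += learningRate * outputDelta * hidden[j].value;
}
for (int i = 0; i < hidden.length; i++) {
    for (int k = 0; k < inputs.length; k++) {
        hidden[i].weights[k] += learningRate * hiddenDelta[i] * inputs[k];
    }
}

With plain assignment, each update discards whatever the weight was before, so corrections never accumulate across patterns; that would be consistent with the weights settling to the same values after the first couple of epochs.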