Neural network back-propagation, training error
After reading some articles about neural networks (back-propagation), I tried to write a simple neural network myself.
I decided on an XOR neural network.
My problem is with training the network: if I use only one example, say 1,1,0 (as input1, input2, targetOutput), then after roughly 500 training iterations the network answers about 0.05.
But if I try more than one example (say 2 different ones, or all 4 possibilities), the network converges to 0.5 as output :(
I searched Google for my mistake with no results :S
I'll try to give as many details as I can to help find what is wrong:
- I've tried networks of 2,2,1 and 2,4,1 (input layer, hidden layer, output layer).
- The output for every neuron is defined by:
double input = 0.0;
// weighted sum of the outputs feeding into this neuron
for (int n = 0; n < layers[i].Count; n++)
    input += layers[i][n].Output * weights[n];
where 'i' is the current layer and 'weights' holds all the weights coming from the previous layer.
- The last layer's (output layer's) error is defined by:
value * (1 - value) * (targetvalue - value);
where 'value' is the neuron's output and 'targetvalue' is the target output for the current neuron.
- The error for the other neurons is defined by:
foreach (neural in nextlayer)
    sum += neural.value * currentneural.weights[neural];
- All the weights in the network are adapted by this formula (for the weight from neural -> neural2):
weight += LearnRate * neural.myvalue * neural2.error;
where LearnRate is the network's learning rate (0.25 in my network).
- The bias weight for each neuron is defined by:
bias += LearnRate * neural.myerror * neural.Bias;
where the bias is a constant value = 1.
That's pretty much everything I can detail.
As I said, the output converges to 0.5 with different training examples :(
Thank you very, very much for your help ^_^.
1 Answer:
It is difficult to tell where the error is without seeing the complete code. One thing you should check carefully is that your calculation of the local error gradient for each unit matches the activation function you are using on that layer. Have a look here for the general formula: http://www.learnartificialneuralnetworks.com/backpropagation.html
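For reference, the standard back-propagation relations that page derives can be written as follows, with \(\varphi\) the activation function, \(\eta\) the learning rate, and \(y_i\) the output of unit \(i\):

\[
\delta_k = \varphi'(\mathrm{net}_k)\,(t_k - y_k) \qquad \text{(output unit } k\text{)}
\]
\[
\delta_j = \varphi'(\mathrm{net}_j)\sum_k \delta_k\, w_{jk} \qquad \text{(hidden unit } j\text{)}
\]
\[
\Delta w_{ij} = \eta\, \delta_j\, y_i
\]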
For instance, the calculation you do for the output layer assumes that you are using a logistic sigmoid activation function, but you don't mention one in the code above, so it looks like you are using a linear activation function instead.
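To illustrate the point, here is a minimal sketch (a fragment in the spirit of the posted snippet, not the poster's actual code; the Sigmoid helper is an invented name): with a logistic sigmoid, the forward pass applies the activation after the weighted sum, and the posted output-error formula then matches the sigmoid's derivative exactly.

// assumes 'using System;' for Math.Exp
static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

// Forward pass for one neuron: weighted sum, THEN the activation function.
double net = 0.0;
for (int n = 0; n < layers[i].Count; n++)
    net += layers[i][n].Output * weights[n];
double value = Sigmoid(net);   // this squashing step appears to be missing

// Output-layer local gradient: value * (1 - value) is exactly the sigmoid's
// derivative expressed through its output, which is why the posted formula
// implicitly assumes a sigmoid activation on that layer.
double error = value * (1.0 - value) * (targetvalue - value);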
In principle a 2-2-1 network should be enough to learn XOR, although training will sometimes get trapped in a local minimum and fail to converge to the correct state. So it is important not to draw conclusions about the performance of your algorithm from a single training session. Note that plain backprop is bound to be slow; there are faster and more robust solutions, Rprop for instance.
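As a sketch of that advice (the network object and its Reset/TrainEpoch/MeanError methods here are hypothetical stand-ins for whatever your implementation exposes), run several independent sessions from fresh random weights before judging the algorithm:

// Hypothetical API: Reset() re-randomizes the weights, TrainEpoch(data) runs
// one pass over the whole training set, MeanError(data) returns the average
// output error over the set.
double best = double.MaxValue;
for (int session = 0; session < 10; session++)
{
    network.Reset();                          // fresh random initial weights
    for (int epoch = 0; epoch < 5000; epoch++)
        network.TrainEpoch(xorData);
    best = Math.Min(best, network.MeanError(xorData));
}
// Only if every session stalls near 0.5 is the algorithm itself the suspect.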
There are books on the subject which provide detailed step-by-step calculations for a simple network (e.g. 'A.I.: A Guide to Intelligent Systems' by Negnevitsky); these could help you debug your algorithm. An alternative would be to use an existing framework (e.g. Encog, FANN, MATLAB), set up the exact same topology and initial weights, and compare the calculations with your own implementation.
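In the same spirit, below is a minimal self-contained 2-2-1 XOR reference in C# that one could diff against step by step. It is a sketch of the standard algorithm under the assumptions discussed above (sigmoid on every non-input unit, online updates), not the poster's code, and all names in it are invented for illustration:

using System;

class XorReference
{
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    static void Main()
    {
        var rng = new Random(1);  // fixed seed; a run can still stall in a local minimum, so re-seed and retry if it does
        // wh[j, k]: weight into hidden unit j (k = 0,1 inputs, k = 2 bias)
        // wo[k]   : weight into the output unit (k = 0,1 hidden, k = 2 bias)
        var wh = new double[2, 3];
        var wo = new double[3];
        for (int j = 0; j < 2; j++)
            for (int k = 0; k < 3; k++)
                wh[j, k] = rng.NextDouble() - 0.5;
        for (int k = 0; k < 3; k++)
            wo[k] = rng.NextDouble() - 0.5;

        double[][] x = { new[]{0.0,0.0}, new[]{0.0,1.0}, new[]{1.0,0.0}, new[]{1.0,1.0} };
        double[]   t = { 0.0, 1.0, 1.0, 0.0 };
        const double lr = 0.5;

        for (int epoch = 0; epoch < 20000; epoch++)
        for (int s = 0; s < 4; s++)
        {
            // forward pass: sigmoid on every non-input unit (bias input = 1)
            double h0 = Sigmoid(wh[0,0]*x[s][0] + wh[0,1]*x[s][1] + wh[0,2]);
            double h1 = Sigmoid(wh[1,0]*x[s][0] + wh[1,1]*x[s][1] + wh[1,2]);
            double y  = Sigmoid(wo[0]*h0 + wo[1]*h1 + wo[2]);

            // backward pass: every delta carries the sigmoid derivative v*(1-v);
            // hidden deltas use the NEXT layer's deltas, before any update
            double dy  = y  * (1 - y)  * (t[s] - y);
            double dh0 = h0 * (1 - h0) * dy * wo[0];
            double dh1 = h1 * (1 - h1) * dy * wo[1];

            // updates: learning rate * delta * upstream activation
            wo[0] += lr * dy * h0;  wo[1] += lr * dy * h1;  wo[2] += lr * dy;
            for (int k = 0; k < 2; k++)
            {
                wh[0,k] += lr * dh0 * x[s][k];
                wh[1,k] += lr * dh1 * x[s][k];
            }
            wh[0,2] += lr * dh0;
            wh[1,2] += lr * dh1;
        }

        // after training, the four outputs should approach 0, 1, 1, 0
        for (int s = 0; s < 4; s++)
        {
            double h0 = Sigmoid(wh[0,0]*x[s][0] + wh[0,1]*x[s][1] + wh[0,2]);
            double h1 = Sigmoid(wh[1,0]*x[s][0] + wh[1,1]*x[s][1] + wh[1,2]);
            double y  = Sigmoid(wo[0]*h0 + wo[1]*h1 + wo[2]);
            Console.WriteLine($"{x[s][0]},{x[s][1]} -> {y:F3}");
        }
    }
}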