基于Java的神经网络---如何实现反向传播
我正在构建一个测试神经网络,但它肯定不起作用。我的主要问题是反向传播。根据我的研究,我知道使用 sigmoid 函数很容易。因此,我通过 (1-Output)(Output)(target-Output) 更新每个权重,但问题是如果我的输出为 1 但我的目标不是?如果在某个时刻它是 1,那么权重更新将始终为 0...现在我只是想把来自 2 个输入神经元的输入相加,所以最佳权重应该是 1 作为输出神经元只需添加其输入即可。我确信我在很多地方都搞砸了,但这是我的代码:
public class Main {
public static void main(String[] args) {
Double[] inputs = {1.0, 2.0};
ArrayList<Double> answers = new ArrayList<Double>();
answers.add(3.0);
net myNeuralNet = new net(2, 1, answers);
for(int i=0; i<200; i++){
myNeuralNet.setInputs(inputs);
myNeuralNet.start();
myNeuralNet.backpropagation();
myNeuralNet.printOutput();
System.out.println("*****");
for(int j=0; j<myNeuralNet.getOutputs().size(); j++){
myNeuralNet.getOutputs().get(j).resetInput();
myNeuralNet.getOutputs().get(j).resetOutput();
myNeuralNet.getOutputs().get(j).resetNumCalled();
}
}
}
}
package myneuralnet;
import java.util.ArrayList;
public class net {
private ArrayList<neuron> inputLayer;
private ArrayList<neuron> outputLayer;
private ArrayList<Double> answers;
public net(Integer numInput, Integer numOut, ArrayList<Double> answers){
inputLayer = new ArrayList<neuron>();
outputLayer = new ArrayList<neuron>();
this.answers = answers;
for(int i=0; i<numOut; i++){
outputLayer.add(new neuron(true));
}
for(int i=0; i<numInput; i++){
ArrayList<Double> randomWeights = createRandomWeights(numInput);
inputLayer.add(new neuron(outputLayer, randomWeights, -100.00, true));
}
for(int i=0; i<numOut; i++){
outputLayer.get(i).setBackConn(inputLayer);
}
}
public ArrayList<neuron> getOutputs(){
return outputLayer;
}
public void backpropagation(){
for(int i=0; i<answers.size(); i++){
neuron iOut = outputLayer.get(i);
ArrayList<neuron> iOutBack = iOut.getBackConn();
Double iSigDeriv = (1-iOut.getOutput())*iOut.getOutput();
Double iError = (answers.get(i) - iOut.getOutput());
System.out.println("Answer: "+answers.get(i) + " iOut: "+iOut.getOutput()+" Error: "+iError+" Sigmoid: "+iSigDeriv);
for(int j=0; j<iOutBack.size(); j++){
neuron jNeuron = iOutBack.get(j);
Double ijWeight = jNeuron.getWeight(i);
System.out.println("ijWeight: "+ijWeight);
System.out.println("jNeuronOut: "+jNeuron.getOutput());
jNeuron.setWeight(i, ijWeight+(iSigDeriv*iError*jNeuron.getOutput()));
}
}
for(int i=0; i<inputLayer.size(); i++){
inputLayer.get(i).resetInput();
inputLayer.get(i).resetOutput();
}
}
public ArrayList<Double> createRandomWeights(Integer size){
ArrayList<Double> iWeight = new ArrayList<Double>();
for(int i=0; i<size; i++){
Double randNum = (2*Math.random())-1;
iWeight.add(randNum);
}
return iWeight;
}
public void setInputs(Double[] is){
for(int i=0; i<is.length; i++){
inputLayer.get(i).setInput(is[i]);
}
for(int i=0; i<outputLayer.size(); i++){
outputLayer.get(i).resetInput();
}
}
public void start(){
for(int i=0; i<inputLayer.size(); i++){
inputLayer.get(i).fire();
}
}
public void printOutput(){
for(int i=0; i<outputLayer.size(); i++){
System.out.println(outputLayer.get(i).getOutput().toString());
}
}
}
package myneuralnet;
import java.util.ArrayList;
public class neuron {
private ArrayList<neuron> connections;
private ArrayList<neuron> backconns;
private ArrayList<Double> weights;
private Double threshold;
private Double input;
private Boolean isOutput = false;
private Boolean isInput = false;
private Double totalSignal;
private Integer numCalled;
private Double myOutput;
public neuron(ArrayList<neuron> conns, ArrayList<Double> weights, Double threshold){
this.connections = conns;
this.weights = weights;
this.threshold = threshold;
this.totalSignal = 0.00;
this.numCalled = 0;
this.backconns = new ArrayList<neuron>();
this.input = 0.00;
}
public neuron(ArrayList<neuron> conns, ArrayList<Double> weights, Double threshold, Boolean isin){
this.connections = conns;
this.weights = weights;
this.threshold = threshold;
this.totalSignal = 0.00;
this.numCalled = 0;
this.backconns = new ArrayList<neuron>();
this.input = 0.00;
this.isInput = isin;
}
public neuron(Boolean tf){
this.connections = new ArrayList<neuron>();
this.weights = new ArrayList<Double>();
this.threshold = 0.00;
this.totalSignal = 0.00;
this.numCalled = 0;
this.isOutput = tf;
this.backconns = new ArrayList<neuron>();
this.input = 0.00;
}
public void setInput(Double input){
this.input = input;
}
public void setOut(Boolean tf){
this.isOutput = tf;
}
public void resetNumCalled(){
numCalled = 0;
}
public void setBackConn(ArrayList<neuron> backs){
this.backconns = backs;
}
public Double getOutput(){
return myOutput;
}
public Double getInput(){
return totalSignal;
}
public Double getRealInput(){
return input;
}
public ArrayList<Double> getWeights(){
return weights;
}
public ArrayList<neuron> getBackConn(){
return backconns;
}
public Double getWeight(Integer i){
return weights.get(i);
}
public void setWeight(Integer i, Double d){
weights.set(i, d);
}
public void setOutput(Double d){
myOutput = d;
}
public void activation(Double myInput){
numCalled++;
totalSignal += myInput;
if(numCalled==backconns.size() && isOutput){
System.out.println("Total Sig: "+totalSignal);
setInput(totalSignal);
setOutput(totalSignal);
}
}
public void activation(){
Double activationValue = 1 / (1 + Math.exp(input));
setInput(activationValue);
fire();
}
public void fire(){
for(int i=0; i<connections.size(); i++){
Double iWeight = weights.get(i);
neuron iConn = connections.get(i);
myOutput = (1/(1+(Math.exp(-input))))*iWeight;
iConn.activation(myOutput);
}
}
public void resetInput(){
input = 0.00;
totalSignal = 0.00;
}
public void resetOutput(){
myOutput = 0.00;
}
}
好的,这是很多代码,所以请允许我解释一下。现在网络很简单,只有一个输入层和一个输出层——我想稍后添加一个隐藏层,但现在我正在采取一些小步骤。每层都是神经元的数组列表。输入神经元加载有输入,在此示例中为 1 和 2。这些神经元触发,计算输入的 sigmoid 并将其输出到输出神经元,输出神经元将它们相加并存储值。然后网络通过获取(答案输出)(输出)(1-输出)(特定输入神经元的输出)进行反向传播,并相应地更新权重。很多时候,它循环下去,我得到无穷大,这似乎与负权重或 sigmoid 相关。当这种情况没有发生时,它会收敛到 1,并且由于(1-1 的输出)为 0,我的权重停止更新。
numCalled 和totalSignal 值只是让算法在继续之前等待所有神经元输入。我知道我这样做的方式很奇怪,但是神经元类有一个称为连接的神经元数组列表,用于保存它正向连接的神经元。另一个名为 backconns 的数组列表保存向后连接。我也应该更新正确的权重,因为我得到了神经元 i 和 j 之间的所有反向连接,但对于所有神经元 j(i 上面的层),我只拉动权重 i。对于造成的混乱,我深表歉意——我已经连续几个小时尝试了很多事情,但仍然无法弄清楚。非常感谢任何帮助!
I am building a test neural network and it is definitely not working. My main problem is backpropagation. From my research, I know that it is easy to use the sigmoid function. Therefore, I update each weight by (1-Output)(Output)(target-Output) but the problem with this is what if my Output is 1 but my target is not? If it is one at some point then the weight update will always be 0...For now I am just trying to get the darn thing to add the inputs from 2 input neurons, so the optimal weights should just be 1 as the output neuron simply adds its inputs. I'm sure I have messed this up in lots of places but here is my code:
public class Main {
public static void main(String[] args) {
Double[] inputs = {1.0, 2.0};
ArrayList<Double> answers = new ArrayList<Double>();
answers.add(3.0);
net myNeuralNet = new net(2, 1, answers);
for(int i=0; i<200; i++){
myNeuralNet.setInputs(inputs);
myNeuralNet.start();
myNeuralNet.backpropagation();
myNeuralNet.printOutput();
System.out.println("*****");
for(int j=0; j<myNeuralNet.getOutputs().size(); j++){
myNeuralNet.getOutputs().get(j).resetInput();
myNeuralNet.getOutputs().get(j).resetOutput();
myNeuralNet.getOutputs().get(j).resetNumCalled();
}
}
}
}
package myneuralnet;
import java.util.ArrayList;
public class net {
private ArrayList<neuron> inputLayer;
private ArrayList<neuron> outputLayer;
private ArrayList<Double> answers;
public net(Integer numInput, Integer numOut, ArrayList<Double> answers){
inputLayer = new ArrayList<neuron>();
outputLayer = new ArrayList<neuron>();
this.answers = answers;
for(int i=0; i<numOut; i++){
outputLayer.add(new neuron(true));
}
for(int i=0; i<numInput; i++){
ArrayList<Double> randomWeights = createRandomWeights(numInput);
inputLayer.add(new neuron(outputLayer, randomWeights, -100.00, true));
}
for(int i=0; i<numOut; i++){
outputLayer.get(i).setBackConn(inputLayer);
}
}
public ArrayList<neuron> getOutputs(){
return outputLayer;
}
public void backpropagation(){
for(int i=0; i<answers.size(); i++){
neuron iOut = outputLayer.get(i);
ArrayList<neuron> iOutBack = iOut.getBackConn();
Double iSigDeriv = (1-iOut.getOutput())*iOut.getOutput();
Double iError = (answers.get(i) - iOut.getOutput());
System.out.println("Answer: "+answers.get(i) + " iOut: "+iOut.getOutput()+" Error: "+iError+" Sigmoid: "+iSigDeriv);
for(int j=0; j<iOutBack.size(); j++){
neuron jNeuron = iOutBack.get(j);
Double ijWeight = jNeuron.getWeight(i);
System.out.println("ijWeight: "+ijWeight);
System.out.println("jNeuronOut: "+jNeuron.getOutput());
jNeuron.setWeight(i, ijWeight+(iSigDeriv*iError*jNeuron.getOutput()));
}
}
for(int i=0; i<inputLayer.size(); i++){
inputLayer.get(i).resetInput();
inputLayer.get(i).resetOutput();
}
}
public ArrayList<Double> createRandomWeights(Integer size){
ArrayList<Double> iWeight = new ArrayList<Double>();
for(int i=0; i<size; i++){
Double randNum = (2*Math.random())-1;
iWeight.add(randNum);
}
return iWeight;
}
public void setInputs(Double[] is){
for(int i=0; i<is.length; i++){
inputLayer.get(i).setInput(is[i]);
}
for(int i=0; i<outputLayer.size(); i++){
outputLayer.get(i).resetInput();
}
}
public void start(){
for(int i=0; i<inputLayer.size(); i++){
inputLayer.get(i).fire();
}
}
public void printOutput(){
for(int i=0; i<outputLayer.size(); i++){
System.out.println(outputLayer.get(i).getOutput().toString());
}
}
}
package myneuralnet;
import java.util.ArrayList;
public class neuron {
private ArrayList<neuron> connections;
private ArrayList<neuron> backconns;
private ArrayList<Double> weights;
private Double threshold;
private Double input;
private Boolean isOutput = false;
private Boolean isInput = false;
private Double totalSignal;
private Integer numCalled;
private Double myOutput;
public neuron(ArrayList<neuron> conns, ArrayList<Double> weights, Double threshold){
this.connections = conns;
this.weights = weights;
this.threshold = threshold;
this.totalSignal = 0.00;
this.numCalled = 0;
this.backconns = new ArrayList<neuron>();
this.input = 0.00;
}
public neuron(ArrayList<neuron> conns, ArrayList<Double> weights, Double threshold, Boolean isin){
this.connections = conns;
this.weights = weights;
this.threshold = threshold;
this.totalSignal = 0.00;
this.numCalled = 0;
this.backconns = new ArrayList<neuron>();
this.input = 0.00;
this.isInput = isin;
}
public neuron(Boolean tf){
this.connections = new ArrayList<neuron>();
this.weights = new ArrayList<Double>();
this.threshold = 0.00;
this.totalSignal = 0.00;
this.numCalled = 0;
this.isOutput = tf;
this.backconns = new ArrayList<neuron>();
this.input = 0.00;
}
public void setInput(Double input){
this.input = input;
}
public void setOut(Boolean tf){
this.isOutput = tf;
}
public void resetNumCalled(){
numCalled = 0;
}
public void setBackConn(ArrayList<neuron> backs){
this.backconns = backs;
}
public Double getOutput(){
return myOutput;
}
public Double getInput(){
return totalSignal;
}
public Double getRealInput(){
return input;
}
public ArrayList<Double> getWeights(){
return weights;
}
public ArrayList<neuron> getBackConn(){
return backconns;
}
public Double getWeight(Integer i){
return weights.get(i);
}
public void setWeight(Integer i, Double d){
weights.set(i, d);
}
public void setOutput(Double d){
myOutput = d;
}
public void activation(Double myInput){
numCalled++;
totalSignal += myInput;
if(numCalled==backconns.size() && isOutput){
System.out.println("Total Sig: "+totalSignal);
setInput(totalSignal);
setOutput(totalSignal);
}
}
public void activation(){
Double activationValue = 1 / (1 + Math.exp(input));
setInput(activationValue);
fire();
}
public void fire(){
for(int i=0; i<connections.size(); i++){
Double iWeight = weights.get(i);
neuron iConn = connections.get(i);
myOutput = (1/(1+(Math.exp(-input))))*iWeight;
iConn.activation(myOutput);
}
}
public void resetInput(){
input = 0.00;
totalSignal = 0.00;
}
public void resetOutput(){
myOutput = 0.00;
}
}
OK so that is a lot of code so allow me to explain. The net is simple for now, just an input layer and an output layer --- I want to add a hidden layer later but I'm taking baby steps for now. Each layer is an arraylist of neurons. Input neurons are loaded with inputs, a 1 and a 2 in this example. These neurons fire, which calculates the sigmoid of the inputs and outputs that to the output neurons, which adds them and stores the value. Then the net backpropagates by taking the (answer-output)(output)(1-output)(output of the specific input neuron) and updates the weights accordingly. A lot of times, it cycles through and I get infinity, which seems to correlate with negative weights or sigmoid. When that doesn't happen it converges to 1 and since (1-output of 1) is 0, my weights stop updating.
The numCalled and totalSignal values are just so the algorithm waits for all neuron inputs before continuing. I know I'm doing this an odd way, but the neuron class has an arraylist of neurons called connections to hold the neurons that it is forward connected to. Another arraylist called backconns holds the backward connections. I should be updating the correct weights as well since I am getting all back connections between neurons i and j but of all neurons j (the layer above i) I am only pulling weight i. I apologize for the messiness --- I've been trying lots of things for hours upon hours now and still cannot figure it out. Any help is greatly appreciated!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
克里斯·毕肖普 (Chris Bishop) 和西蒙·海金 (Simon Haykin) 的著作是一些关于神经网络的最佳教科书。尝试阅读有关反向传播的章节,并理解为什么权重更新规则中的术语是这样的。我要求您这样做的原因是反向传播比乍一看更微妙。如果您对输出层使用线性激活函数(想想为什么要这样做。提示:后处理),或者如果您添加隐藏层,情况会发生一些变化。当我真正读完这本书后,我就更清楚了。
Some of the best textbooks on neural networks in general are Chris Bishop's and Simon Haykin's. Try reading through the chapter on backprop and understand why the terms in the weight update rule are the way they are.The reason why I am asking you to do that is that backprop is more subtle than it seems at first. Things change a bit if you use a linear activation function for the output layer (think about why you might want to do that. Hint: post-processing), or if you add a hidden layer. It got clearer for me when I actually read the book.
您可能想将您的代码与这个单层感知器进行比较。
我认为你的反向传播算法有一个错误。另外,尝试用方波代替 sigmoid。
http://web.archive.org /web/20101228185321/http://en.literateprograms.org/Perceptron_%28Java%29
You might want to compare your code to this single layer perceptron.
I think you have a bug in your backprop algo. Also, try replacing the sigmoid with a squarewave.
http://web.archive.org/web/20101228185321/http://en.literateprograms.org/Perceptron_%28Java%29
sigmoid 函数 1/(1 + Math.exp(-x)) 永远不会等于 1。当 x 接近无穷大时,lim 等于 0,但这是一个水平渐近线,因此该函数实际上永远不会接触 1。因此,如果这个表达式用于计算所有的输出值,那么你的输出永远不会是 1。所以 (1 - 输出) 不应该等于 0。
我认为你的问题是在计算输出期间。对于神经网络,每个神经元的输出通常是 sigmoid(输入和权重的点积)。换句话说,值=输入1*权重1+输入2*权重2+...(对于神经元的每个权重)+biasWeight。那么该神经元的输出 = 1 / (1 + Math.exp(-value)。如果以这种方式计算,输出将永远不会等于 1。
The sigmoid function 1/(1 + Math.exp(-x)) never equates to 1. The lim as x approaches infinity is equal to 0, but this is a horizontal asymptote, so the function never actually touches 1. Therefore, if this expression is used to compute all of your output values, then your output will never be 1. So (1 - output) shouldn't ever equal 0.
I think your issue is during the calculation of the output. For a neural network, the output for each neuron is typically sigmoid(dot product of inputs and weights). In other words, value = input1 * weight1 + input2 * weight2 + ... (for each weight of neuron) + biasWeight. Then that neuron's output = 1 / (1 + Math.exp(-value). If it's calculated in this way, the output won't ever be equal to 1.