反向传播实现问题
我应该做什么。我有一个黑白图像(100x100px):
我应该训练一个 使用该图像的反向传播神经网络。输入是图像的 x、y 坐标(从 0 到 99),输出是 1(白色)或 0(黑色)。
一旦网络学习完毕,我希望它能够根据权重重现图像,并获得最接近原始图像的图像。
这是我的反向传播实现:
import os
import math
import Image
import random
from random import sample
#------------------------------ class definitions
class Weight:
def __init__(self, fromNeuron, toNeuron):
self.value = random.uniform(-0.5, 0.5)
self.fromNeuron = fromNeuron
self.toNeuron = toNeuron
fromNeuron.outputWeights.append(self)
toNeuron.inputWeights.append(self)
self.delta = 0.0 # delta value, this will accumulate and after each training cycle used to adjust the weight value
def calculateDelta(self, network):
self.delta += self.fromNeuron.value * self.toNeuron.error
class Neuron:
def __init__(self):
self.value = 0.0 # the output
self.idealValue = 0.0 # the ideal output
self.error = 0.0 # error between output and ideal output
self.inputWeights = []
self.outputWeights = []
def activate(self, network):
x = 0.0;
for weight in self.inputWeights:
x += weight.value * weight.fromNeuron.value
# sigmoid function
if x < -320:
self.value = 0
elif x > 320:
self.value = 1
else:
self.value = 1 / (1 + math.exp(-x))
class Layer:
def __init__(self, neurons):
self.neurons = neurons
def activate(self, network):
for neuron in self.neurons:
neuron.activate(network)
class Network:
def __init__(self, layers, learningRate):
self.layers = layers
self.learningRate = learningRate # the rate at which the network learns
self.weights = []
for hiddenNeuron in self.layers[1].neurons:
for inputNeuron in self.layers[0].neurons:
self.weights.append(Weight(inputNeuron, hiddenNeuron))
for outputNeuron in self.layers[2].neurons:
self.weights.append(Weight(hiddenNeuron, outputNeuron))
def setInputs(self, inputs):
self.layers[0].neurons[0].value = float(inputs[0])
self.layers[0].neurons[1].value = float(inputs[1])
def setExpectedOutputs(self, expectedOutputs):
self.layers[2].neurons[0].idealValue = expectedOutputs[0]
def calculateOutputs(self, expectedOutputs):
self.setExpectedOutputs(expectedOutputs)
self.layers[1].activate(self) # activation function for hidden layer
self.layers[2].activate(self) # activation function for output layer
def calculateOutputErrors(self):
for neuron in self.layers[2].neurons:
neuron.error = (neuron.idealValue - neuron.value) * neuron.value * (1 - neuron.value)
def calculateHiddenErrors(self):
for neuron in self.layers[1].neurons:
error = 0.0
for weight in neuron.outputWeights:
error += weight.toNeuron.error * weight.value
neuron.error = error * neuron.value * (1 - neuron.value)
def calculateDeltas(self):
for weight in self.weights:
weight.calculateDelta(self)
def train(self, inputs, expectedOutputs):
self.setInputs(inputs)
self.calculateOutputs(expectedOutputs)
self.calculateOutputErrors()
self.calculateHiddenErrors()
self.calculateDeltas()
def learn(self):
for weight in self.weights:
weight.value += self.learningRate * weight.delta
def calculateSingleOutput(self, inputs):
self.setInputs(inputs)
self.layers[1].activate(self)
self.layers[2].activate(self)
#return round(self.layers[2].neurons[0].value, 0)
return self.layers[2].neurons[0].value
#------------------------------ initialize objects etc
inputLayer = Layer([Neuron() for n in range(2)])
hiddenLayer = Layer([Neuron() for n in range(10)])
outputLayer = Layer([Neuron() for n in range(1)])
learningRate = 0.4
network = Network([inputLayer, hiddenLayer, outputLayer], learningRate)
# let's get the training set
os.chdir("D:/stuff")
image = Image.open("backprop-input.gif")
pixels = image.load()
bbox = image.getbbox()
width = 5#bbox[2] # image width
height = 5#bbox[3] # image height
trainingInputs = []
trainingOutputs = []
b = w = 0
for x in range(0, width):
for y in range(0, height):
if (0, 0, 0, 255) == pixels[x, y]:
color = 0
b += 1
elif (255, 255, 255, 255) == pixels[x, y]:
color = 1
w += 1
trainingInputs.append([float(x), float(y)])
trainingOutputs.append([float(color)])
print "\nOriginal image ... Black:"+str(b)+" White:"+str(w)+"\n"
#------------------------------ let's train
for i in range(500):
for j in range(len(trainingOutputs)):
network.train(trainingInputs[j], trainingOutputs[j])
network.learn()
for w in network.weights:
w.delta = 0.0
#------------------------------ let's check
b = w = 0
for x in range(0, width):
for y in range(0, height):
out = network.calculateSingleOutput([float(x), float(y)])
if 0.0 == round(out):
color = (0, 0, 0, 255)
b += 1
elif 1.0 == round(out):
color = (255, 255, 255, 255)
w += 1
pixels[x, y] = color
#print out
print "\nAfter learning the network thinks ... Black:"+str(b)+" White:"+str(w)+"\n"
显然,我的实现存在一些问题。上面的代码返回:
原始图像...黑色:21 白色:4
学习网络后认为... 黑色:25 白色:0
如果我尝试使用更大的训练集(出于测试目的,我仅测试上图中的 25 个像素),它也会做同样的事情。它返回学习后所有像素都应该是黑色的。
现在,如果我使用这样的手动训练集:
trainingInputs = [
[0.0,0.0],
[1.0,0.0],
[2.0,0.0],
[0.0,1.0],
[1.0,1.0],
[2.0,1.0],
[0.0,2.0],
[1.0,2.0],
[2.0,2.0]
]
trainingOutputs = [
[0.0],
[1.0],
[1.0],
[0.0],
[1.0],
[0.0],
[0.0],
[0.0],
[1.0]
]
#------------------------------ let's train
for i in range(500):
for j in range(len(trainingOutputs)):
network.train(trainingInputs[j], trainingOutputs[j])
network.learn()
for w in network.weights:
w.delta = 0.0
#------------------------------ let's check
for inputs in trainingInputs:
print network.calculateSingleOutput(inputs)
输出例如:
0.0330125791296 # this should be 0, OK
0.953539182136 # this should be 1, OK
0.971854575477 # this should be 1, OK
0.00046146137467 # this should be 0, OK
0.896699762781 # this should be 1, OK
0.112909223162 # this should be 0, OK
0.00034058462280 # this should be 0, OK
0.0929886299643 # this should be 0, OK
0.940489647869 # this should be 1, OK
换句话说,网络正确猜测了所有像素(黑色和白色)。如果我使用图像中的实际像素而不是像上面那样的硬编码训练集,为什么它说所有像素都应该是黑色?
我尝试更改隐藏层中的神经元数量(最多 100 个神经元),但没有成功。
这是一份家庭作业。
这也是我之前关于反向传播的问题的延续。
What I am supposed to do. I have an black and white image (100x100px):
I am supposed to train a backpropagation neural network with this image. The inputs are x, y coordinates of the image (from 0 to 99) and output is either 1 (white color) or 0 (black color).
Once the network has learned, I would like it to reproduce the image based on its weights and get the closest possible image to the original.
Here is my backprop implementation:
import os
import math
import Image
import random
from random import sample
#------------------------------ class definitions
class Weight:
def __init__(self, fromNeuron, toNeuron):
self.value = random.uniform(-0.5, 0.5)
self.fromNeuron = fromNeuron
self.toNeuron = toNeuron
fromNeuron.outputWeights.append(self)
toNeuron.inputWeights.append(self)
self.delta = 0.0 # delta value, this will accumulate and after each training cycle used to adjust the weight value
def calculateDelta(self, network):
self.delta += self.fromNeuron.value * self.toNeuron.error
class Neuron:
def __init__(self):
self.value = 0.0 # the output
self.idealValue = 0.0 # the ideal output
self.error = 0.0 # error between output and ideal output
self.inputWeights = []
self.outputWeights = []
def activate(self, network):
x = 0.0;
for weight in self.inputWeights:
x += weight.value * weight.fromNeuron.value
# sigmoid function
if x < -320:
self.value = 0
elif x > 320:
self.value = 1
else:
self.value = 1 / (1 + math.exp(-x))
class Layer:
def __init__(self, neurons):
self.neurons = neurons
def activate(self, network):
for neuron in self.neurons:
neuron.activate(network)
class Network:
def __init__(self, layers, learningRate):
self.layers = layers
self.learningRate = learningRate # the rate at which the network learns
self.weights = []
for hiddenNeuron in self.layers[1].neurons:
for inputNeuron in self.layers[0].neurons:
self.weights.append(Weight(inputNeuron, hiddenNeuron))
for outputNeuron in self.layers[2].neurons:
self.weights.append(Weight(hiddenNeuron, outputNeuron))
def setInputs(self, inputs):
self.layers[0].neurons[0].value = float(inputs[0])
self.layers[0].neurons[1].value = float(inputs[1])
def setExpectedOutputs(self, expectedOutputs):
self.layers[2].neurons[0].idealValue = expectedOutputs[0]
def calculateOutputs(self, expectedOutputs):
self.setExpectedOutputs(expectedOutputs)
self.layers[1].activate(self) # activation function for hidden layer
self.layers[2].activate(self) # activation function for output layer
def calculateOutputErrors(self):
for neuron in self.layers[2].neurons:
neuron.error = (neuron.idealValue - neuron.value) * neuron.value * (1 - neuron.value)
def calculateHiddenErrors(self):
for neuron in self.layers[1].neurons:
error = 0.0
for weight in neuron.outputWeights:
error += weight.toNeuron.error * weight.value
neuron.error = error * neuron.value * (1 - neuron.value)
def calculateDeltas(self):
for weight in self.weights:
weight.calculateDelta(self)
def train(self, inputs, expectedOutputs):
self.setInputs(inputs)
self.calculateOutputs(expectedOutputs)
self.calculateOutputErrors()
self.calculateHiddenErrors()
self.calculateDeltas()
def learn(self):
for weight in self.weights:
weight.value += self.learningRate * weight.delta
def calculateSingleOutput(self, inputs):
self.setInputs(inputs)
self.layers[1].activate(self)
self.layers[2].activate(self)
#return round(self.layers[2].neurons[0].value, 0)
return self.layers[2].neurons[0].value
#------------------------------ initialize objects etc
inputLayer = Layer([Neuron() for n in range(2)])
hiddenLayer = Layer([Neuron() for n in range(10)])
outputLayer = Layer([Neuron() for n in range(1)])
learningRate = 0.4
network = Network([inputLayer, hiddenLayer, outputLayer], learningRate)
# let's get the training set
os.chdir("D:/stuff")
image = Image.open("backprop-input.gif")
pixels = image.load()
bbox = image.getbbox()
width = 5#bbox[2] # image width
height = 5#bbox[3] # image height
trainingInputs = []
trainingOutputs = []
b = w = 0
for x in range(0, width):
for y in range(0, height):
if (0, 0, 0, 255) == pixels[x, y]:
color = 0
b += 1
elif (255, 255, 255, 255) == pixels[x, y]:
color = 1
w += 1
trainingInputs.append([float(x), float(y)])
trainingOutputs.append([float(color)])
print "\nOriginal image ... Black:"+str(b)+" White:"+str(w)+"\n"
#------------------------------ let's train
for i in range(500):
for j in range(len(trainingOutputs)):
network.train(trainingInputs[j], trainingOutputs[j])
network.learn()
for w in network.weights:
w.delta = 0.0
#------------------------------ let's check
b = w = 0
for x in range(0, width):
for y in range(0, height):
out = network.calculateSingleOutput([float(x), float(y)])
if 0.0 == round(out):
color = (0, 0, 0, 255)
b += 1
elif 1.0 == round(out):
color = (255, 255, 255, 255)
w += 1
pixels[x, y] = color
#print out
print "\nAfter learning the network thinks ... Black:"+str(b)+" White:"+str(w)+"\n"
Obviously, there is some issue with my implementation. The above code returns:
Original image ... Black:21 White:4
After learning the network thinks ...
Black:25 White:0
It does the same thing if I try to use larger training set (I'm testing just 25 pixels from the image above for testing purposes). It returns that all pixels should be black after learning.
Now, if I use a manual training set like this instead:
trainingInputs = [
[0.0,0.0],
[1.0,0.0],
[2.0,0.0],
[0.0,1.0],
[1.0,1.0],
[2.0,1.0],
[0.0,2.0],
[1.0,2.0],
[2.0,2.0]
]
trainingOutputs = [
[0.0],
[1.0],
[1.0],
[0.0],
[1.0],
[0.0],
[0.0],
[0.0],
[1.0]
]
#------------------------------ let's train
for i in range(500):
for j in range(len(trainingOutputs)):
network.train(trainingInputs[j], trainingOutputs[j])
network.learn()
for w in network.weights:
w.delta = 0.0
#------------------------------ let's check
for inputs in trainingInputs:
print network.calculateSingleOutput(inputs)
The output is for example:
0.0330125791296 # this should be 0, OK
0.953539182136 # this should be 1, OK
0.971854575477 # this should be 1, OK
0.00046146137467 # this should be 0, OK
0.896699762781 # this should be 1, OK
0.112909223162 # this should be 0, OK
0.00034058462280 # this should be 0, OK
0.0929886299643 # this should be 0, OK
0.940489647869 # this should be 1, OK
In other words the network guessed all pixels right (both black and white). Why does it say all pixels should be black if I use actual pixels from the image instead of hard coded training set like the above?
I tried changing the amount of neurons in the hidden layers (up to 100 neurons) with no success.
This is a homework.
This is also a continuation of my previous question about backprop.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
已经有一段时间了,但我确实获得了这方面的学位,所以我想希望其中的一些内容能够保留下来。
据我所知,输入集使中间层神经元超载过深。也就是说,您的输入集由 10,000 个离散输入值 (100 pix x 100 pix) 组成;你正在尝试将这 10,000 个值编码到 10 个神经元中。这种级别的编码很难(我怀疑这是可能的,但肯定很难);至少,你需要大量的训练(超过 500 次运行)才能让它合理地重现。即使中间层有 100 个神经元,您也会看到相对密集的压缩级别(100 个像素对 1 个神经元)。
对于这些问题该怎么办;好吧,这很棘手。你可以大幅增加中间神经元的数量,并且会得到合理的效果,但当然需要很长时间来训练。但是,我认为可能有不同的解决方案;如果可能,您可以考虑使用极坐标而不是笛卡尔坐标作为输入;快速观察输入模式表明高度对称,并且实际上您会看到沿角坐标具有重复可预测变形的线性模式,这似乎可以在少量中间层神经元中很好地编码。
这件事很棘手;寻求模式编码的通用解决方案(就像您的原始解决方案一样)非常复杂,并且通常(即使有大量的中间层神经元)需要大量的训练过程;另一方面,一些预先的启发式任务分解和一些问题重新定义(即预先从笛卡尔坐标转换为极坐标)可以为明确定义的问题集提供良好的解决方案。当然,其中存在着永久的摩擦。通用的解决方案很难找到,但稍微更具体的解决方案确实会很好。
无论如何,有趣的东西!
It's been a while, but I did get my degree in this stuff, so I think hopefully some of it has stuck.
From what I can tell, you're too deeply overloading your middle layer neurons with the input set. That is, your input set consists of 10,000 discrete input values (100 pix x 100 pix); you're attempting to encode those 10,000 values into 10 neurons. This level of encoding is hard (I suspect it's possible, but certainly hard); at the least, you'd need a LOT of training (more than 500 runs) to get it to reproduce reasonably. Even with 100 neurons for the middle layer, you're looking at a relatively dense compression level going on (100 pixels to 1 neuron).
As to what to do about these problems; well, that's tricky. You can increase your number of middle neurons dramatically, and you'll get a reasonable effect, but of course it'll take a long time to train. However, I think there might be a different solution; if possible, you might consider using polar coordinates instead of cartesian coordinates for the input; quick eyeballing of the input pattern indicates a high level of symmetry, and effectively you'd be looking at a linear pattern with a repeated predictable deformation along the angular coordinate, which it seems would encode nicely in a small number of middle layer neurons.
This stuff is tricky; going for a general solution for pattern encoding (as your original solution does) is very complex, and can usually (even with large numbers of middle layer neurons) require a lot of training passes; on the other hand, some advance heuristic task breakdown and a little bit of problem redefinition (i.e. advance converting from cartesian to polar coordinates) can give good solutions for well defined problem sets. Therein, of course, is the perpetual rub; general solutions are hard to come by, but slightly more specified solutions can be quite nice indeed.
Interesting stuff, in any event!