Does it make sense for the weights and threshold to grow proportionally when training my perceptron?
I am taking my first steps in neural networks, and to do so I am experimenting with a very simple single-layer, single-output perceptron which uses a sigmoidal activation function. I update my weights on-line each time a training example is presented, using:
weights += learningRate * (correct - result) * {input,1}
Here `weights` is an n-length vector which also contains the weight from the bias neuron (minus the threshold), `result` is the output computed by the perceptron (and passed through the sigmoid) for the given `input`, `correct` is the correct result, and `{input, 1}` is the input augmented with 1 (the fixed input from the bias neuron). Now, when I try to train the perceptron to perform logical AND, the weights don't converge for a long time; instead they keep growing at similar rates and maintain a ratio of roughly -1.5 with the threshold. For instance, the three weights are, in sequence:
5.067160008240718 5.105631826680446 -7.945513136885797
...
8.40390853077094 8.43890306970281 -12.889540730182592
I would expect the perceptron to stop at 1, 1, -1.5.
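For reference, a minimal sketch of the setup described in the question (assuming NumPy, a learning rate of 0.5, and the 0/1 AND truth table as targets; these specifics are not from the original post) reproduces the behavior: the thresholded predictions become correct early on, yet the raw weights keep growing.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# AND truth table; each input is augmented with the fixed bias input 1
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)  # [w1, w2, bias weight (-threshold)]
lr = 0.5

for epoch in range(5000):
    for x, correct in zip(X, t):
        result = sigmoid(w @ x)
        w += lr * (correct - result) * x  # the on-line update from the question

# The decision boundary is right long before this point,
# but the weight magnitudes never stop increasing.
print(w)
```

Thresholding `sigmoid(X @ w)` at 0.5 gives the correct AND outputs even though `w` itself never settles, which is exactly the symptom described above.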
Apart from this problem, which looks connected to some missing stopping condition in the learning, if I try to use the identity function as the activation function, I get weight values oscillating around:
0.43601272528257057 0.49092558197172703 -0.23106430854347537
and I obtain similar results with `tanh`. I can't find an explanation for this.
Thank you
Tunnuz
It is because the sigmoid activation function doesn't reach one (or zero) even with very highly positive (or negative) inputs. So
`(correct - result)` will always be non-zero, and your weights will always get updated. Try it with the step function as the activation function (i.e. f(x) = 1 for x > 0, f(x) = 0 otherwise).

Your average weight values don't seem right for the identity activation function. It might be that your learning rate is a little high -- try reducing it and see if that reduces the size of the oscillations.
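With the step function, the update becomes the classic perceptron rule, and updates stop entirely once every example is classified correctly, since `(correct - result)` is then exactly zero. A sketch (assuming 0/1 targets, a learning rate of 0.1, and zero-initialized weights, none of which come from the original answer):

```python
import numpy as np

def step(x):
    return 1.0 if x > 0 else 0.0  # f(x) = 1 for x > 0, f(x) = 0 otherwise

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(3)
lr = 0.1

for epoch in range(100):
    changed = False
    for x, correct in zip(X, t):
        result = step(w @ x)
        if result != correct:
            w += lr * (correct - result) * x
            changed = True
    if not changed:  # a full pass with no errors: the weights have converged
        break

print(w)
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop terminates with a finite, fixed weight vector.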
Also, when doing online learning (aka stochastic gradient descent), it is common practice to reduce the learning rate over time so that you converge to a solution. Otherwise your weights will continue to oscillate.
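One conventional choice of decaying schedule (a sketch; the 1/t form is just one common option, not something prescribed by the answer) is:

```python
def decayed_lr(lr0, t):
    """1/t-style decay: large steps early in training, small steps later."""
    return lr0 / (1.0 + t)

# e.g. use decayed_lr(0.5, epoch) in place of a fixed learningRate
for epoch in range(5):
    print(epoch, decayed_lr(0.5, epoch))
```

As the step size shrinks, the weight updates per example shrink with it, which damps the oscillation around the solution.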
When trying to analyze the behavior of the perceptron, it also helps to look at `correct` and `result`.