Why do I get a weight matrix full of NaN?
(Hebbian learning)
I was given the task of programming Oja's learning rule and Sanger's learning rule in Matlab to train a neural network. The network has 6 inputs and 4 outputs, and my training set comes from a multivariate uniform distribution, with Xi ~ U(-ai,ai) and ai≠aj for all i≠j.
These are the most relevant files (Most comments and oja.m were not included)
main.m
TS = generarVectoresUnif(6, [1, 4, 9, 36, 25, 16], 512);
TS = TS';
W = unifrnd(0,1,[4,6]);
% it is not very fast. That's why I put 500 iterations
W_sanger = sanger(W,TS,500, 0.05)
generarVectoresUnif.m
function [ TS ] = generarVectoresUnif( dim, rangos, n )
    dimensiones = int8(dim);
    tamanio = int32(n);
    TS = [];
    for i = 1:dimensiones
        TS = [TS, unifrnd(-rangos(i), rangos(i), [tamanio, 1]) ];
    end
sanger.m
( NOTE:
W is a 4 x 6 size matrix.
Wi is the weight vector for the i-th output.
Wij = (Wi)j.
In the example, TS is a 6 x 512 size matrix
)
function [ W ] = sanger( W_init, trainingset, iteraciones , eta)
    W = W_init;
    % get the sizes from the input parameters
    size_input = size(W,2);
    size_output = size(W,1);
    n_patterns = size(trainingset, 2);
    % one-tenth part
    diezmo = iteraciones/10;
    for it = 1:iteraciones
        if 0 == mod(it, diezmo)
            disp(horzcat('Iteracion numero ', num2str(it), ' de ',num2str(iteraciones)));
        end
        % for each pattern
        for u = 1:n_patterns
            DeltaW = zeros(size(W));
            % Vi = sum{j=1...N} Wij * Xj
            V = W * trainingset(:,u);
            % sumatorias(i,j) is going to replace sum{k=1..i} Vk*Wkj
            sumatorias = zeros(size_output,size_input);
            for j = 1:size_input
                for k = 1:size_output
                    % sum from 1 to k without writing another loop
                    sumatorias(k,j) = (V' .* [ones(1,k), zeros(1,size_output-k)]) * W(:,j);
                end
            end
            % compute the weight change
            for i = 1:size_output
                for j=1:size_input
                    % Delta Wij = eta * Vi * ( xj - sum{k=1..i} Vk*Wkj )
                    DeltaW(i,j) = eta * V(i,1) * (trainingset(j,u) - sumatorias(i,j));
                end
            end
            W = W + DeltaW;
            %W = 1/norm(W) * W; %<---is it necessary? [Hertz] doesn't mention it
        end
    end
Could you tell me please what I am doing wrong? Values of the matrix grow really fast.
I have the same problem with oja.m
I've tried:
- Replacing eta by 1/it --->NaN
- Replacing eta by an exponential function of number of iterations --->ok, but it's not what I expected
- Uncommenting W = 1/norm(W) * W;. This actually works, but it shouldn't be necessary, or should it?
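On the normalization question: Oja's rule already contains a decay term, so explicit renormalization should not be needed as long as eta is small enough. The following is only a minimal sketch for a single output, not the oja.m from the question (which was not posted). The variable names, the value eta = 1e-5, the 50 epochs, and the use of rand in place of unifrnd are all assumptions made for illustration.

```matlab
% Single-output Oja's rule sketch (hypothetical; not the poster's oja.m).
rangos = [1, 4, 9, 36, 25, 16];
n_patterns = 512;
% Uniform samples in [-rangos(i), rangos(i)], one pattern per column.
TS = bsxfun(@times, rangos', 2 * rand(numel(rangos), n_patterns) - 1);

w = rand(1, 6);   % initial weights in (0,1), as in main.m
eta = 1e-5;       % assumed small learning rate
n_epochs = 50;    % assumed number of passes over the data

for it = 1:n_epochs
    for u = 1:n_patterns
        x = TS(:, u);
        V = w * x;                        % scalar output
        % Delta w = eta * V * (x - V*w); the -V^2*w part is the built-in decay
        w = w + eta * V * (x' - V * w);
    end
end
fprintf('norm(w) = %.4f\n', norm(w));
```

With a learning rate this small, norm(w) should settle near 1 without any call to norm inside the loop. With eta = 0.05 the same loop is expected to blow up.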
Comments (2)
You need a smaller value of eta. Consider your update rule:
DeltaW(i,j) = eta * V(i,1) * (trainingset(j,u) - sumatorias(i,j));
If eta is large, DeltaW is likely to have a large absolute value (a large positive number such as 100000, or a large negative one such as -111111). The next time around the loop, sumatorias(i,j) will be quite large, because it is a function of the weights. The more iterations you run, the larger the weights become, eventually leading to an overflow. Once entries overflow to Inf, an operation such as Inf - Inf yields NaN, which is what ends up filling the matrix.
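As a rough way to judge how small eta has to be, one can compare it with the largest eigenvalue of the input correlation matrix. The sketch below is a heuristic check, not a proof: it assumes the same ranges as in main.m, uses rand instead of unifrnd, and the names C and lambda_max are introduced here for illustration. For Xi ~ U(-ai,ai) the variance is ai^2/3, so the largest eigenvalue should be close to 36^2/3 = 432.

```matlab
% Heuristic check: compare eta against the scale of the input data.
rangos = [1, 4, 9, 36, 25, 16];
n_patterns = 512;
% Uniform samples in [-rangos(i), rangos(i)], one pattern per column.
TS = bsxfun(@times, rangos', 2 * rand(numel(rangos), n_patterns) - 1);

C = (TS * TS') / n_patterns;   % sample correlation matrix, 6 x 6
lambda_max = max(eig(C));      % expected to be near 36^2/3 = 432

for eta = [0.05, 1e-5]
    % A product well above 1 suggests each update can overshoot.
    fprintf('eta = %g -> eta * lambda_max = %g\n', eta, eta * lambda_max);
end
```

For eta = 0.05 the product is far above 1, which fits the runaway growth described above. For eta = 1e-5 it is far below 1.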
Ok. After several tries, I made it work.
I chose a relatively small value of eta: 0.00001
It's still slow, because the code does not take advantage of matrix multiplication, which Matlab optimizes.
I hope it helps someone else to not repeat the same mistake.
Regards!
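For the speed issue, the three inner loops can be replaced by one matrix expression, since Delta W = eta * (V*x' - LT(V*V') * W), where LT keeps the lower triangle including the diagonal. This is a sketch under assumptions, not the sanger.m above: the data are generated with rand instead of unifrnd, and n_epochs = 100 is an illustrative value.

```matlab
% Vectorized sketch of Sanger's rule (not the original sanger.m).
% Assumes W is n_outputs x n_inputs and TS is n_inputs x n_patterns.
rangos = [1, 4, 9, 36, 25, 16];
n_patterns = 512;
% Uniform samples in [-rangos(i), rangos(i)], one pattern per column.
TS = bsxfun(@times, rangos', 2 * rand(numel(rangos), n_patterns) - 1);

W = rand(4, 6);   % initial weights in (0,1), as in main.m
eta = 1e-5;       % the small learning rate reported above
n_epochs = 100;   % illustrative value chosen for this sketch

for it = 1:n_epochs
    for u = 1:n_patterns
        x = TS(:, u);
        V = W * x;   % outputs, 4 x 1
        % tril(V*V') keeps only the terms with k <= i, so row i of
        % tril(V*V') * W equals V_i * sum_{k<=i} V_k * W(k,:).
        W = W + eta * (V * x' - tril(V * V') * W);
    end
end
disp(W);
```

The per-pattern update is the same rule as in the loop version, just expressed with tril, so the result should differ only through the random data and initial weights.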