Neural network always produces the same/similar outputs for any input

Posted on 2024-10-08 16:55:08

画尸师 2024-10-15 16:55:08

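I ran into a similar problem, but was able to solve it by changing the following:

Scale the problem down to a manageable size. I first tried too many inputs, with too many hidden layer units. Once I scaled the problem down, I could see whether the solution to the smaller problem worked. This also helps because the time to compute the weights drops significantly once the problem is smaller, so I could try many different things without waiting.
Make sure you have enough hidden units. This was a major problem for me. I had about 900 inputs connected to about 10 units in the hidden layer. That was far too few to converge quickly. But it also became very slow if I added more units. Reducing the number of inputs helped a lot.
Change the activation function and its parameters. I was using tanh at first. I tried other functions: sigmoid, normalized sigmoid, Gaussian, etc. I also found that changing the function parameters to make the function steeper or shallower affected how quickly the network converged.
Change the learning algorithm parameters. Try different learning rates (0.01 to 0.9). Also try different momentum parameters (0.1 to 0.9), if your algorithm supports momentum.

Hope this helps those who find this thread on Google!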
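
To make the last tip concrete, here is a minimal sketch of a learning-rate and momentum sweep. It is not a neural network: the loss is a hypothetical one-parameter toy, (w - 3)^2, and the function name `steps_to_converge` is made up for illustration. The point is only that the same kind of grid (learning rates from 0.01 to 0.9, momentum from 0 to 0.9) can be scanned cheaply once the problem is small.

```python
def steps_to_converge(learning_rate, momentum, max_steps=1000, tol=1e-3):
    """Minimise the toy loss (w - 3)^2 with momentum gradient descent.

    Returns the number of updates needed to get within `tol` of the
    minimum, or None if that does not happen within `max_steps`.
    """
    w, velocity = 0.0, 0.0
    for step in range(1, max_steps + 1):
        grad = 2.0 * (w - 3.0)            # derivative of (w - 3)^2
        velocity = momentum * velocity - learning_rate * grad
        w += velocity
        if abs(w - 3.0) < tol:
            return step
    return None

# Scan a small grid of settings and record how fast each one converges.
results = {}
for lr in (0.01, 0.1, 0.5, 0.9):
    for momentum in (0.0, 0.5, 0.9):
        results[(lr, momentum)] = steps_to_converge(lr, momentum)
        print(f"lr={lr:<4} momentum={momentum:<3} steps={results[(lr, momentum)]}")
```

On a real network the best setting will differ, so treat the grid itself, not these particular numbers, as the takeaway.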

掀纱窥君容 2024-10-15 16:55:08

So I realise this is extremely late for the original post, but I came across this because I was having a similar problem and none of the reasons posted here cover what was wrong in my case.

I was working on a simple regression problem, but every time I trained the network it would converge to a point where it was giving me the same output (or sometimes a few different outputs) for each input. I played with the learning rate, the number of hidden layers/nodes, the optimization algorithm etc but it made no difference. Even when I looked at a ridiculously simple example, trying to predict the output (1d) of two different inputs (1d):

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class net(nn.Module):
    def __init__(self, obs_size, hidden_size):
        super(net, self).__init__()
        self.fc = nn.Linear(obs_size, hidden_size)
        self.out = nn.Linear(hidden_size, 1)

    def forward(self, obs):
        h = F.relu(self.fc(obs))
        return self.out(h)

inputs = np.array([[0.5],[0.9]])
# Shape [2], while the network output below has shape [2, 1].
targets = torch.tensor([3.0, 2.0], dtype=torch.float32)

network = net(1,5)
optimizer = torch.optim.Adam(network.parameters(), lr=0.001)

for i in range(10000):
    out = network(torch.tensor(inputs, dtype=torch.float32))
    loss = F.mse_loss(out, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print("Loss: %f outputs: %f, %f" % (loss.item(), out[0].item(), out[1].item()))

but STILL it was always outputting the average of the target values for both inputs. It turns out the reason was that the dimensions of my outputs and targets were not the same: the targets were Size[2] and the outputs were Size[2,1], so PyTorch broadcast the two against each other to Size[2,2] inside the MSE loss, which completely messes everything up. Once I changed:

targets = torch.tensor([3.0, 2.0], dtype=torch.float32)

to

targets = torch.tensor([[3.0], [2.0]], dtype=torch.float32)

it worked as it should. This was obviously done with PyTorch, but I suspect other libraries may broadcast variables in the same way.
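
For anyone wondering why the mismatch drives both outputs to the average, here is a small sketch in plain Python that imitates what the broadcast does. It does not call PyTorch; the helper names `elementwise_mse` and `broadcast_mse` are made up for illustration, and the numbers are the two targets from the example above.

```python
def elementwise_mse(outputs, targets):
    """MSE when output i is compared only with target i (the intended loss)."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)

def broadcast_mse(outputs, targets):
    """MSE when a [N, 1] output is broadcast against a [N] target.

    Broadcasting yields an N x N grid, so every output is compared with
    every target rather than only with its own.
    """
    n = len(outputs)
    return sum((o - t) ** 2 for o in outputs for t in targets) / (n * n)

targets = [3.0, 2.0]
perfect = [3.0, 2.0]      # the predictions we actually want
averaged = [2.5, 2.5]     # both predictions equal to the target mean

print(elementwise_mse(perfect, targets), elementwise_mse(averaged, targets))
print(broadcast_mse(perfect, targets), broadcast_mse(averaged, targets))
```

With the broadcast version the "averaged" predictions score better than the perfect ones, so the optimizer is doing exactly what the loss asks for. Comparing `out.shape` with `targets.shape` before computing the loss is a cheap way to catch this.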

剩一世无双 2024-10-15 16:55:08

For me it was happening exactly like in your case, the output of neural network was always the same no matter the training & number of layers etc.

Turns out my back-propagation algorithm had a problem. At one place I was multiplying by -1 where it wasn't required.

There could be another problem like this. The question is how to debug it?

Steps to debug:

Step 1: Write the algorithm so that it can take a variable number of hidden layers and a variable number of input and output nodes.
Step 2: Reduce the hidden layers to 0. Reduce the input to 2 nodes and the output to 1 node.
Step 3: Now train it on the binary OR operation.
Step 4: If it converges correctly, go to Step 8.
Step 5: If it doesn't converge, train it on only 1 training sample.
Step 6: Print all the forward and backward propagation variables (weights, node outputs, deltas, etc.).
Step 7: Take pen and paper and calculate all the variables manually.
Step 8: Cross-verify the values with the algorithm.
Step 9: If you don't find any problem with 0 hidden layers, increase the number of hidden layers to 1 and repeat steps 5, 6, 7 and 8.

It sounds like a lot of work, but it works very well IMHO.
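
As a rough sketch of steps 2, 3 and 8: the code below trains a single sigmoid neuron (no hidden layer, 2 inputs, 1 output) on binary OR. In place of pen and paper it cross-checks the hand-derived gradient against finite differences, which is a substitute I am assuming is acceptable here, not part of the steps above. The squared-error loss, the starting weights and the learning rate of 1.0 are likewise arbitrary choices for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Binary OR: two inputs, one output, no hidden layer.
SAMPLES = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

def loss(params):
    """Mean squared error of a single sigmoid neuron; params = [w1, w2, bias]."""
    w1, w2, b = params
    return sum((sigmoid(w1 * x1 + w2 * x2 + b) - t) ** 2
               for (x1, x2), t in SAMPLES) / len(SAMPLES)

def analytic_gradient(params):
    """Gradient of `loss` derived by hand via the chain rule."""
    w1, w2, b = params
    grad = [0.0, 0.0, 0.0]
    for (x1, x2), t in SAMPLES:
        y = sigmoid(w1 * x1 + w2 * x2 + b)
        delta = 2.0 * (y - t) * y * (1.0 - y) / len(SAMPLES)
        grad[0] += delta * x1
        grad[1] += delta * x2
        grad[2] += delta           # the bias sees a constant input of 1
    return grad

def numeric_gradient(params, eps=1e-5):
    """Central finite differences: an independent check on the hand-derived gradient."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grad.append((loss(up) - loss(down)) / (2.0 * eps))
    return grad

params = [0.3, -0.2, 0.1]
max_gap = max(abs(a - n) for a, n in zip(analytic_gradient(params), numeric_gradient(params)))
print("largest gradient mismatch:", max_gap)

for _ in range(5000):                      # plain batch gradient descent
    params = [p - 1.0 * g for p, g in zip(params, analytic_gradient(params))]
predictions = [round(sigmoid(params[0] * x1 + params[1] * x2 + params[2]))
               for (x1, x2), _ in SAMPLES]
print("predictions:", predictions)
```

A stray factor of -1 in `analytic_gradient`, like the bug I had, shows up immediately as a large mismatch.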

玩物 2024-10-15 16:55:08

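I had the same problem with my model when the number of layers was large. I was using a learning rate of 0.0001. When I lowered the learning rate to 0.0000001, the problem seemed to be solved. I think the algorithm gets stuck in a local minimum when the learning rate is too high.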

鹿港巷口少年归 2024-10-15 16:55:08

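I know that this is far too late for the original post, but maybe I can help someone else with this, as I faced the same problem.

For me the problem was that my input data had missing values in important columns, whereas those values were not missing in the training/test data. I replaced these values with zeros and, voilà, the results suddenly became plausible. So maybe check your data; it may be misrepresented.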
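
A quick way to run that check, sketched in plain Python. The helper names `missing_counts` and `fill_missing` and the sample rows are hypothetical, and filling with zero is simply what worked in my case, not a general recommendation.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def missing_counts(rows):
    """Count missing entries in each column of a list of rows."""
    counts = [0] * len(rows[0])
    for row in rows:
        for col, value in enumerate(row):
            if is_missing(value):
                counts[col] += 1
    return counts

def fill_missing(rows, fill_value=0.0):
    """Return a copy of `rows` with missing entries replaced by `fill_value`."""
    return [[fill_value if is_missing(v) else v for v in row] for row in rows]

# Hypothetical data: the second column is missing in two of three rows.
data = [[1.0, None, 0.5], [2.0, 4.0, 0.7], [3.0, float("nan"), 0.9]]
print(missing_counts(data))
cleaned = fill_missing(data)
print(missing_counts(cleaned))
```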

拔了角的鹿 2024-10-15 16:55:08

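It's hard to tell without seeing a code sample, but this can happen to a network because of its number of hidden neurons. As the number of neurons and hidden layers increases, it becomes impossible to train the network with a small set of training data. As long as a network with fewer layers and neurons can do the job, it is a mistake to use a larger one. So perhaps your problem can be solved by paying attention to this.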

瀞厅☆埖开 2024-10-15 16:55:08

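I haven't tested it with the XOR problem in the question, but for my original dataset based on Tic-Tac-Toe, I believe I have gotten the network to train somewhat (I only ran 1000 epochs, which wasn't enough): the quickpropagation network can win or tie over half of its games; backpropagation can get about 41%. The problems came down to implementation errors (small ones) and to not understanding the difference between the error derivative (which is per weight) and the error for each neuron, which I did not pick up on in my research. @darkcanuck's answer about training the bias in the same way as a weight would probably have helped, though I didn't implement it. I also rewrote my code in Python so that I could experiment with it more easily. So, although I haven't gotten the network to match the minimax algorithm's efficiency, I believe that I have managed to solve the problem.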

眼泪也成诗 2024-10-15 16:55:08

I ran into a similar problem earlier when my data was not properly normalized. Once I normalized the data, everything ran correctly.
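
A minimal sketch of one common kind of normalization, z-score standardization of each input column. Whether this is the right scheme for your data is an assumption on my part (min-max scaling is another option), and the feature names and values below are made up.

```python
import statistics

def standardize(columns):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    scaled = []
    for column in columns:
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)
        if std == 0.0:
            std = 1.0     # constant column: avoid dividing by zero
        scaled.append([(value - mean) / std for value in column])
    return scaled

# Hypothetical features on very different scales.
ages = [20.0, 30.0, 40.0]
incomes = [20000.0, 50000.0, 80000.0]
scaled_ages, scaled_incomes = standardize([ages, incomes])
print(scaled_ages)
print(scaled_incomes)
```

After scaling, both columns cover the same range, so neither one dominates the weighted sums inside the network.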

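Recently I faced this issue again, and after debugging I found that there can be another reason for a neural network giving the same output. If your neural network has a weight decay term, such as the one in the RSNNS package, make sure that the decay term is not so large that all the weights go to essentially 0.

I was using the caret package in R. Initially, I was using a decay hyperparameter of 0.01. When I looked at the diagnostics, I saw that the RMSE was being calculated for each fold (of cross-validation), but the Rsquared was always NA. In this case all predictions were coming out to the same value.

Once I reduced the decay to a much lower value (1E-5 and lower), I got the expected results.

I hope this helps.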
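
To illustrate the effect outside of caret or RSNNS, here is a toy sketch in Python: a straight-line fit with an L2 penalty on the slope, trained by gradient descent. The function `fit_line`, the data and the two decay values are assumptions chosen for illustration; they are not the settings or the code of either R package.

```python
def fit_line(xs, ys, decay, learning_rate=0.001, steps=20000):
    """Fit y = w * x + b by gradient descent on MSE + decay * w^2."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n + 2.0 * decay * w
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

xs = [-1.0, 0.0, 1.0, 2.0]
ys = [2.0 * x + 1.0 for x in xs]       # hypothetical data on the line y = 2x + 1

for decay in (1e-5, 100.0):
    w, b = fit_line(xs, ys, decay)
    predictions = [w * x + b for x in xs]
    spread = max(predictions) - min(predictions)
    print(f"decay={decay}: w={w:.4f}, prediction range={spread:.4f}")
```

With the large decay the slope is squeezed towards 0 and every prediction lands close to the mean of the targets, which is the "same output for every input" symptom.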

九八野马 2024-10-15 16:55:08

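It's hard to tell without seeing a code sample, but a bias bug can have this effect (e.g. forgetting to add the bias to the input), so I would take a closer look at that part of the code.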

北斗星光 2024-10-15 16:55:08

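Based on your comments, I'd agree with @finnw that you have a bias problem. You should treat the bias as a constant "1" (or -1 if you prefer) input to each neuron. Each neuron will also have its own weight for the bias, so a neuron's output should be the sum of the weighted inputs, plus the bias times its weight, passed through the activation function. Bias weights are updated during training just like the other weights.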
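
A small sketch of that description in Python, assuming a sigmoid activation. The function name `neuron_output` and the weight values are made up for illustration.

```python
import math

def neuron_output(inputs, weights, bias_weight=None):
    """Sigmoid neuron; the bias is modelled as a constant input of 1 with its own weight."""
    total = sum(x * w for x, w in zip(inputs, weights))
    if bias_weight is not None:
        total += 1.0 * bias_weight
    return 1.0 / (1.0 + math.exp(-total))

weights = [0.8, -0.4]          # hypothetical weights
zero_input = [0.0, 0.0]

without_bias = neuron_output(zero_input, weights)
with_bias = neuron_output(zero_input, weights, bias_weight=-2.0)
print(without_bias, with_bias)
```

Without the bias term an all-zero input always gives 0.5, whatever the weights are, so no amount of training can change that output. With a trainable bias weight it can.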

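Fausett's "Fundamentals of Neural Networks" (p. 300) has an XOR example using binary inputs and a network with 2 inputs, 1 hidden layer of 4 neurons and one output neuron. Weights are randomly initialized between +0.5 and -0.5. With a learning rate of 0.02, the example network converges after about 3000 epochs. You should be able to get a similar result if you get the bias problem (and any other bugs) ironed out.

Also note that you cannot solve the XOR problem without a hidden layer in your network.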

凯凯我们等你回来 2024-10-15 16:55:08

I encountered a similar issue and found out that it was a problem with how my weights were being generated.
I was using:

w = numpy.random.rand(layers[i], layers[i+1])

This generated a random weight between 0 and 1.
The problem was solved when I used randn() instead:

w = numpy.random.randn(layers[i], layers[i+1])

This generates negative as well as positive weights (drawn from a standard normal distribution), which helped my outputs become more varied.
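
Here is a rough sketch of why all-positive weights can flatten the outputs. It uses Python's `random` module as a stand-in for `numpy.random.rand` and `numpy.random.randn`, and it assumes a sigmoid unit fed by 100 non-negative inputs. Both of those are assumptions for illustration, not details of my original network.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)          # fixed seed so the run is repeatable
n_inputs, n_samples = 100, 20

# Hypothetical non-negative inputs, e.g. pixel intensities scaled to [0, 1).
samples = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]

uniform_weights = [rng.random() for _ in range(n_inputs)]          # like rand(): all in [0, 1)
normal_weights = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]    # like randn(): mixed signs

def outputs(weights):
    """Sigmoid output of one unit for every sample."""
    return [sigmoid(sum(x * w for x, w in zip(sample, weights))) for sample in samples]

uniform_out = outputs(uniform_weights)
normal_out = outputs(normal_weights)
print("uniform init: min %.4f max %.4f" % (min(uniform_out), max(uniform_out)))
print("normal init:  min %.4f max %.4f" % (min(normal_out), max(normal_out)))
```

With weights in [0, 1) every weighted sum is large and positive, so the sigmoid saturates near 1 for all samples. With mixed-sign weights the sums partly cancel, which usually leaves the unit far less saturated.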

那请放手 2024-10-15 16:55:08

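I ran into this exact problem. I was using nnet to predict 6 rows of data with 1200+ columns.

Each column would return a different prediction, but all of the rows in that column would have the same value.

I got around this by increasing the size parameter significantly. I increased it from 1-5 to 11+.

I have also heard that decreasing the decay rate can help.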

漫漫岁月 2024-10-15 16:55:08

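I have had similar problems with machine learning algorithms, and when I looked at the code I found random generators that were not really random. If you do not use a new random seed (such as Unix time, see http://en.wikipedia.org/wiki/Unix_time), then it is possible to get exactly the same results over and over again.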
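
A short sketch of the difference, using Python's standard `random` and `time` modules. The helper `draw` and the seed value 1234 are made up for illustration.

```python
import random
import time

def draw(seed, count=5):
    """Return `count` pseudo-random numbers from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]

fixed_a = draw(1234)
fixed_b = draw(1234)                 # same seed, so exactly the same "random" numbers
time_seeded = draw(time.time_ns())   # seed taken from the clock, differs between runs

print(fixed_a == fixed_b)
print(fixed_a == time_seeded)
```

A fixed seed is still useful while debugging, because it makes a bad run reproducible. The trouble only starts when you expect different initial weights on each run and silently get the same ones.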
