Multilayer neural network won't predict negative values

Posted on 2024-10-19 04:46:28

I have implemented a multilayer perceptron to predict the sine of input vectors. Each vector consists of four values chosen at random from -1, 0 and 1, plus a bias set to 1. The network should predict the sine of the sum of the vector's contents.

e.g. Input = <0,1,-1,0,1>, Output = sin(0+1+(-1)+0+1)

The problem I am having is that the network never predicts a negative value, and many of the vectors' sine values are negative. It predicts all positive or zero outputs perfectly. I presume there is a problem with updating the weights, which are updated after every epoch. Has anyone encountered this problem with NNs before? Any help at all would be great!

Note: The network has 5 inputs, 6 hidden units in 1 hidden layer, and 1 output. I am using a sigmoid activation function on the hidden and output layers, and have tried tons of learning rates (currently 0.1).
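To make the setup concrete, here is a minimal sketch of how the target for one example could be computed. The function name `target_for` is hypothetical, and the sketch assumes (as the example above suggests) that the bias entry is included in the sum.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Target for one example: the sine of the sum of all five entries
// (four values from {-1, 0, 1} plus the bias entry of 1, as in the example).
double target_for(const std::vector<int>& input) {
    int sum = std::accumulate(input.begin(), input.end(), 0);
    return std::sin(static_cast<double>(sum));
}
```

For the input <-1,-1,-1,0,1> the sum is -2, so the target is sin(-2), which is negative. Targets like this are the ones the network fails to reproduce.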

Comments (3)

梦回旧景 2024-10-26 04:46:28

It has been a long time since I looked into multilayer perceptrons, so take this with a grain of salt.

I'd rescale your target values to the [0,1] range instead of [-1,1]. If you take a look at the logistic function graph:

[Figure: graph of the logistic function]

It only generates values between 0 and 1, so I do not expect it to produce negative results. I might be wrong, though.
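The rescaling idea can be sketched as follows. The helper names `to_unit` and `from_unit` are hypothetical and not part of the original post. The assumption is that the network is trained against (sin(s) + 1) / 2 and that its output is mapped back afterwards.

```cpp
#include <cmath>

// Logistic sigmoid: mathematically its output lies strictly between 0 and 1,
// so on its own it can never match a negative target.
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Hypothetical helper: map a target from [-1, 1] into [0, 1] for training.
double to_unit(double y) {
    return (y + 1.0) / 2.0;
}

// Hypothetical helper: map a network output from [0, 1] back to [-1, 1].
double from_unit(double u) {
    return 2.0 * u - 1.0;
}
```

With this mapping, a sigmoid output below 0.5 corresponds to a negative prediction once `from_unit` is applied.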

EDIT:

You can actually extend the logistic function to your problem domain. Use the generalized logistic curve, setting the A and K parameters to the lower and upper boundaries of your output range.

Another option is the hyperbolic tangent, which ranges from -1 to +1 and has no constants to set up.
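A minimal sketch of that suggestion, assuming the simplest form of the generalized logistic curve in which every parameter other than A and K is fixed at 1. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Simplified generalized logistic curve with lower asymptote A and upper
// asymptote K (all other curve parameters fixed at 1, an assumption).
double scaled_logistic(double x, double A, double K) {
    assert(A < K);
    return A + (K - A) / (1.0 + std::exp(-x));
}

// Derivative with respect to x, written in terms of the output y
// (y must be a value returned by scaled_logistic, not the raw input).
double scaled_logistic_derivative(double y, double A, double K) {
    assert(A < K);
    return (y - A) * (K - y) / (K - A);
}
```

With A = -1 and K = 1 the output covers (-1, 1), so negative sine values become reachable. The derivative has to be rescaled along with the function, otherwise backpropagation uses the wrong slope.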

若沐 2024-10-26 04:46:28

There are many different kinds of activation functions, many of which are designed to output a value from 0 to 1. If you're using a function that only outputs values between 0 and 1, try adjusting it so that it outputs values between -1 and 1. If you were using FANN, I would tell you to use the FANN_SIGMOID_SYMMETRIC activation function.
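For readers not using FANN, the same adjustment can be sketched by hand. This is not FANN's implementation. It assumes a symmetric sigmoid means the logistic curve stretched to the range (-1, 1), and the function name is hypothetical.

```cpp
#include <cmath>

// Hypothetical symmetric sigmoid: the logistic curve stretched and shifted
// so its output lies in (-1, 1). Mathematically equal to tanh(x).
double symmetric_sigmoid(double x) {
    return 2.0 / (1.0 + std::exp(-2.0 * x)) - 1.0;
}
```

Because the output is centered on zero, negative inputs give negative outputs, which a plain sigmoid cannot do.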

清风挽心 2024-10-26 04:46:28

Although the question has already been answered, allow me to share my experience. I have been trying to approximate the sine function using a 1-4-1 neural network.
Similar to your case, I am not allowed to use any high-level API like TensorFlow. Moreover, I am bound to use C++ rather than Python 3! (BTW, I mostly prefer C++.)

I used Sigmoid activation and its derivative defined as:

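#include <cmath>

// Logistic sigmoid: maps any real x into (0, 1).
double sigmoid(double x)
{
   return 1.0 / (1.0 + std::exp(-x));
}

// Derivative of the sigmoid, expressed in terms of its output:
// x must be sigmoid(z), not the raw input z.
double Sigmoid_derivative(double x)
{
   return x * (1.0 - x);
}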
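One detail worth checking: `Sigmoid_derivative` as written is only correct when its argument is the sigmoid's output, not the raw pre-activation. The sketch below compares it against a central finite difference. The helper `numeric_sigmoid_slope` is hypothetical and exists only for this check.

```cpp
#include <cmath>

double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Takes the sigmoid OUTPUT s = sigmoid(z), not z itself.
double Sigmoid_derivative(double s) {
    return s * (1.0 - s);
}

// Hypothetical helper: central finite-difference estimate of d/dz sigmoid(z).
double numeric_sigmoid_slope(double z, double h = 1e-5) {
    return (sigmoid(z + h) - sigmoid(z - h)) / (2.0 * h);
}
```

Calling `Sigmoid_derivative(sigmoid(z))` agrees with the numeric slope, while `Sigmoid_derivative(z)` generally does not, so passing the wrong quantity during backpropagation silently corrupts the weight updates.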

And this is what I got after 10,000 epochs, training the network on 20 training examples.
[Figure: network output with sigmoid activation after 10,000 epochs]

As you can see, the network did not fit the negative part of the curve. So I changed the activation function to tanh.

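#include <cmath>

// Hyperbolic tangent: maps any real x into (-1, 1).
// Named tanh_activation because a global tanh clashes with <cmath>.
double tanh_activation(double x)
{
   return std::tanh(x);
}

// Derivative of tanh, expressed in terms of its output:
// x must be tanh(z), not the raw input z.
double tanh_derivative(double x)
{
   return 1.0 - x * x;
}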

And, surprisingly, after half as many epochs (i.e., 5,000), I got a far better curve.
[Figure: network output with tanh activation after 5,000 epochs]
The fit should improve significantly with more hidden neurons, more epochs, and better (and more) training examples. Shuffling the data is important too!
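The shuffling step can be sketched with the standard library. The function name `shuffled_indices` is hypothetical. The assumption is that the training examples are stored in an indexable container and visited in a fresh random order each epoch.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Returns the indices 0..n-1 in a random order; call once per epoch and
// visit the training examples in that order.
std::vector<std::size_t> shuffled_indices(std::size_t n, std::mt19937& rng) {
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::shuffle(order.begin(), order.end(), rng);
    return order;
}
```

Shuffling indices rather than the examples themselves keeps each input paired with its target without copying any data.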
