Neural network settings for fast training

Posted 2024-09-02 14:27:05 | 592 words | 9 views | 0 comments

I am creating a tool for predicting the time and cost of software projects based on past data. The tool uses a neural network to do this and so far the results are promising, but I think I can do a lot more optimisation just by changing the properties of the network. There don't seem to be any rules, or even many best practices, when it comes to these settings, so if anyone with experience could help me I would greatly appreciate it.

The input data is made up of a series of integers that could go as high as the user wants, though I expect most will be under 100,000. Some will be as low as 1. They are details such as the number of people on a project and the cost of a project, as well as details about database entities and use cases.

There are 10 inputs in total and 2 outputs (the time and cost). I am using Resilient Propagation to train the network. Currently it has: 10 input nodes, 1 hidden layer with 5 nodes and 2 output nodes. I am training to get under a 5% error rate.

The algorithm must run on a webserver so I have put in a measure to stop training when it looks like it isn't going anywhere. This is set to 10,000 training iterations.

Currently, when I try to train it with some data that is a bit varied, but well within the limits of what we expect users to put into it, it takes a long time to train, hitting the 10,000 iteration limit over and over again.
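
For reference, a stopping rule of this kind could be sketched roughly as below. This is only an illustration: `train_with_limits`, the `step` callback, and the `patience` and `min_improvement` parameters are hypothetical names, not part of the tool described above. Besides the 10,000-iteration cap and the 5% target, the sketch also gives up when the error has stopped improving for a while, which is an added assumption rather than something the current setup does.

```python
from typing import Callable, Tuple


def train_with_limits(
    step: Callable[[], float],
    target_error: float = 0.05,
    max_iterations: int = 10_000,
    patience: int = 500,
    min_improvement: float = 1e-4,
) -> Tuple[int, float, str]:
    """Call `step` until the target is met, progress stalls, or the cap is hit.

    `step` is a hypothetical callback that runs one training iteration
    and returns the current error as a fraction (0.05 means 5%).
    """
    best_error = float("inf")
    error = float("inf")
    stale = 0  # iterations since the last meaningful improvement
    for iteration in range(1, max_iterations + 1):
        error = step()
        if error <= target_error:
            return iteration, error, "target reached"
        if best_error - error > min_improvement:
            best_error = error
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return iteration, error, "stalled"
    return max_iterations, error, "iteration limit"


# Toy stand-in for a real training step: the error never improves,
# so the loop gives up long before the 10,000-iteration cap.
print(train_with_limits(lambda: 0.5, patience=10))
```

Returning the reason alongside the iteration count makes it easier to tell a network that is still slowly improving from one that is stuck.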

This is the first time I have used a neural network and I don't really know what to expect. If you could give me some hints on what sort of settings I should be using for the network and for the iteration limit I would greatly appreciate it.

Thank you!

Comments (1)

飘逸的'云 2024-09-09 14:27:05

First of all, thanks for providing so much information about your network! Here are a few pointers that should give you a clearer picture.

  • You need to normalize your inputs. If one node sees a mean value of 100,000 and another just 0.5, the two inputs won't have an equal impact on the network, so rescale them to a comparable range.
  • Only 5 hidden neurons for 10 input nodes? I remember reading somewhere that you need at least double the number of inputs (a rule of thumb, not a hard requirement); try 20+ hidden neurons. This will give your network the capacity to represent a more complex mapping. However, with too many neurons your network will just memorize the training data set.
  • Resilient backpropagation is fine. Just remember that there are other training algorithms out there like Levenberg-Marquardt.
  • How many training examples (input/output pairs) do you have? Neural networks usually need a large dataset to be good at making useful predictions.
  • Consider adding a momentum factor to your weight-training algorithm to speed things up if you haven't done so already.
  • Online training tends to be better for making generalized predictions than batch training. The former updates the weights after each individual training example is run through the network, while the latter updates them only once the whole training data set has been passed through. It's your call.
  • Is your data discrete or continuous? Neural networks tend to do a better job with 0s and 1s than with continuous values. If your data is discrete, I'd recommend using the sigmoid activation function. A combination of tanh and linear activation functions for the hidden and output layers tends to do a good job with continuously varying data.
  • Do you need another hidden layer? It may help if your network is dealing with complex input-output surface mapping.
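
To make the first pointer concrete, here is a minimal sketch of input normalization in plain Python. The function names `fit_scaler` and `transform` and the sample rows are made up for illustration. Taking a logarithm before standardizing is an assumption on my part, chosen because the inputs are positive integers spanning 1 to roughly 100,000; plain z-scoring or min-max scaling are equally reasonable options.

```python
import math
from typing import List, Sequence, Tuple


def fit_scaler(rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return (mean, std) of log1p(value) for every input column."""
    stats = []
    for column in zip(*rows):
        logged = [math.log1p(v) for v in column]
        mean = sum(logged) / len(logged)
        variance = sum((x - mean) ** 2 for x in logged) / len(logged)
        std = math.sqrt(variance) or 1.0  # constant column: avoid dividing by zero
        stats.append((mean, std))
    return stats


def transform(row: Sequence[float], stats: List[Tuple[float, float]]) -> List[float]:
    """Log-compress each value, then shift and scale it with the fitted stats."""
    return [(math.log1p(v) - mean) / std for v, (mean, std) in zip(row, stats)]


# Made-up rows: people on the project, cost, number of use cases.
history = [[3, 12_000, 8], [10, 95_000, 40], [1, 2_500, 2]]
stats = fit_scaler(history)
scaled = [transform(row, stats) for row in history]
print(scaled)
```

The statistics should be fitted on the training data only and then reused unchanged for every later prediction. The two outputs (time and cost) would likely benefit from the same kind of scaling, with the inverse applied to the network's predictions.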