Regarding number 3, one way to find out when your ANN starts to overfit is to plot the net's accuracy on your training data and on your test data against the number of epochs performed. At some point, as the training accuracy keeps climbing (tending towards 100%), the test accuracy will probably start to decrease, because the ANN is overfitting to the training data. Note the epoch where that starts to happen and make sure not to train past it.
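As a rough sketch of reading that curve without a plotting library, the snippet below prints both accuracies per epoch and reports where the test accuracy peaks. The accuracy lists are made-up numbers for illustration only, and `best_test_epoch` is a hypothetical helper name, not part of any library.

```python
def best_test_epoch(test_acc):
    """Return the 1-based epoch at which test accuracy peaked (first peak on ties)."""
    best_index = max(range(len(test_acc)), key=lambda i: test_acc[i])
    return best_index + 1


# Made-up per-epoch accuracies, for illustration only (assumption, not real results).
train_acc = [0.60, 0.72, 0.81, 0.88, 0.93, 0.96, 0.98, 0.99]
test_acc = [0.58, 0.69, 0.76, 0.80, 0.81, 0.79, 0.77, 0.74]

# Print the two curves side by side; a growing gap hints at overfitting.
for epoch, (tr, te) in enumerate(zip(train_acc, test_acc), start=1):
    print(f"epoch {epoch}: train={tr:.2f} test={te:.2f} gap={tr - te:.2f}")

stop_epoch = best_test_epoch(test_acc)
print("test accuracy peaks at epoch", stop_epoch)
```

In practice you would record the two lists during training and feed them to whatever plotting tool you like; the peak of the test curve is the point not to train past.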
If your data is very regular and consistent, then the network might not overfit until very late in the game, or not at all. And if your data is highly irregular, then your ANN will start to overfit much earlier.
Also, something like k-fold cross-validation is one way to test how regular your data is.
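For what it's worth, here is a minimal pure-Python sketch of k-fold cross-validation. The names `k_fold_splits`, `cross_validate`, and `placeholder_score` are hypothetical, and `train_and_score` stands for whatever function trains your net on one set of indices and returns its score on the other. The idea, as I read it, is that a small spread of scores across folds suggests consistent data, while a large spread suggests irregular data.

```python
import random
import statistics


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for n, fold in enumerate(folds) if n != i for j in fold]
        yield train_idx, val_idx


def cross_validate(n_samples, k, train_and_score):
    """Return the mean and the spread (population std dev) of the k fold scores."""
    scores = [train_and_score(tr, va) for tr, va in k_fold_splits(n_samples, k)]
    return statistics.mean(scores), statistics.pstdev(scores)


def placeholder_score(train_idx, val_idx):
    # Stand-in only: a real version would train the ANN on train_idx
    # and return its accuracy on val_idx.
    return len(train_idx) / (len(train_idx) + len(val_idx))


mean_score, spread = cross_validate(10, 5, placeholder_score)
print("mean score:", mean_score, "spread across folds:", spread)
```

Each sample ends up in the validation fold exactly once, so every pattern gets used both for training and for checking.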
For most of these questions, you need to try different options to see what works best. That is the problem with ANNs: there is no "best" way to do almost anything. You need to find out what works for your specific problem. Nevertheless, here is my advice for your questions.
1) I prefer incremental learning. I think it is important for the network weights to be updated after each pattern.
2) This is a tough question. It really depends on the complexity of your network: how many input nodes, output nodes, and training patterns there are. For your problem, I might start with 100 and try values above and below 100 to see if there is improvement.
3) I usually calculate the total error of the network on the test set (not the training set) after each epoch. If that error increases for about 5 epochs, I stop training and use the network as it was before the increase began. It is important not to use the training-set error when deciding when to stop; doing so is what leads to overfitting.
4) You could also try a probabilistic neural network if you are representing your output as 26 nodes, each representing a letter of the alphabet. This network architecture is good for classification problems. Again, it may be a good idea just to try a few different architectures to see what works best for your problem.
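The stopping rule in point 3 could be sketched roughly as follows. This is only an illustration under some assumptions: `train_one_epoch` and `test_error` are hypothetical callbacks you would supply for your own network, and I count "epochs without a new best test error" rather than strictly consecutive increases, which is a common variant of the same idea. The error values in the demo are made up.

```python
import copy


def train_with_early_stopping(model, train_one_epoch, test_error,
                              patience=5, max_epochs=1000):
    """Train until the test error has not improved for `patience` epochs.

    Returns a copy of the model from the best epoch, that epoch number,
    and the test error reached there.
    """
    best_error = float("inf")
    best_model = copy.deepcopy(model)
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch(model)
        error = test_error(model)  # measured on the test set, not the training set
        if error < best_error:
            best_error = error
            best_model = copy.deepcopy(model)  # snapshot before any later increase
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_model, best_epoch, best_error


# Toy stand-ins so the sketch runs: the "model" just counts epochs, and the
# test error follows a made-up curve that falls and then rises.
fake_errors = [0.9, 0.7, 0.5, 0.4, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8]
toy_model = {"epochs_trained": 0}


def toy_train_one_epoch(model):
    model["epochs_trained"] += 1


def toy_test_error(model):
    return fake_errors[model["epochs_trained"] - 1]


best_model, best_epoch, best_error = train_with_early_stopping(
    toy_model, toy_train_one_epoch, toy_test_error,
    patience=5, max_epochs=len(fake_errors))
print("stopped after", toy_model["epochs_trained"], "epochs; keeping epoch", best_epoch)
```

Keeping a snapshot at each new best is what lets you fall back to the network from before the error started rising, at the cost of one extra copy of the weights.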