How does the cross-entropy error function work in plain backpropagation?
I'm working on a feed-forward backpropagation network in C++ but cannot seem to make it work properly. The network I'm basing mine on is using the cross-entropy error function. However, I'm not very familiar with it and even though I'm trying to look it up I'm still not sure. Sometimes it seems easy, sometimes difficult. The network will solve a multinomial classification problem and as far as I understand, the cross-entropy error function is suitable for these cases.
Someone that knows how it works?
1 Answer
Ah yes, good ol' backpropagation. The joy of it is that it doesn't really matter (implementation-wise) which error function you use, as long as it is differentiable. Once you know how to compute the cross-entropy for each output unit (see the wiki article), you simply take the partial derivative of that function with respect to the weights to get the gradient for the output layer, then apply the chain rule again to propagate it back through the hidden layer and the input layer.
However, if your question isn't about implementation but rather about training difficulties, then you have your work cut out for you. Different error functions are good at different things (it's best to reason this out from the error function's definition), and the problem is compounded by other parameters such as the learning rate.
Hope that helps; let me know if you need any other info. Your question was a little vague...