Optimizing multiple loss functions in PyTorch
I am training a model with multiple outputs in PyTorch, and I have four different losses: for position (in meters), rotation (in degrees), velocity, and a boolean value (0 or 1) that the model has to predict.
AFAIK, there are two ways to define a final loss function here:
one - a naive weighted sum of the losses;
two - defining a coefficient for each loss and optimizing the final combination.
So, my question is: how do I weigh these losses to obtain the final loss correctly?

2 Answers
This is not a question about programming but about optimization in a multi-objective setup. The two options you've described come down to the same approach, which is a linear combination of the loss terms. However, keep in mind there are many other approaches out there, including dynamic loss weighting, uncertainty weighting, etc. In practice, the most commonly used approach is the linear combination, where each objective gets a weight that is determined via grid search or random search.
You can look up this survey on multi-task learning, which showcases some approaches: Multi-Task Learning for Dense Prediction Tasks: A Survey, Vandenhende et al., T-PAMI '20.
This is an active line of research; as such, there is no definitive answer to your question.
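The linear combination described above can be sketched in PyTorch as follows. This is a minimal illustration, not code from the question: the weight values and the dictionary keys (`pos`, `rot`, `vel`, `flag`) are assumptions, and the weights would in practice be tuned via grid search or random search.

```python
import torch
import torch.nn as nn

# Hypothetical weights for the four losses; in practice these would be
# tuned via grid search or random search over validation performance.
w_pos, w_rot, w_vel, w_bool = 1.0, 0.1, 0.5, 1.0

mse = nn.MSELoss()                 # regression losses
bce = nn.BCEWithLogitsLoss()       # for the boolean output (logits vs. 0/1)

def total_loss(pred, target):
    """Linear combination of the four task losses.

    `pred` and `target` are dicts of tensors; the keys are illustrative.
    """
    l_pos = mse(pred["pos"], target["pos"])      # position, in meters
    l_rot = mse(pred["rot"], target["rot"])      # rotation, in degrees
    l_vel = mse(pred["vel"], target["vel"])      # velocity
    l_bool = bce(pred["flag"], target["flag"])   # boolean prediction
    return (w_pos * l_pos + w_rot * l_rot
            + w_vel * l_vel + w_bool * l_bool)
```

Because the four quantities live on different scales (meters vs. degrees vs. a probability), the weights also serve to bring the loss terms to comparable magnitudes before summing.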
That's an interesting problem. As @lvan said, this is a multi-objective optimization problem.
The multi-loss/multi-task setup is as follows:

l = f + g

where l is the total loss, f is the classification loss function, and g is the detection loss function. Different loss functions decrease at different rates. As learning progresses, the rates at which the two loss functions decrease are quite inconsistent: often one decreases very quickly while the other decreases very slowly.
There is a paper devoted to this question:
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
The main idea of the paper is to estimate the uncertainty of each task and then automatically adjust the weight of its loss.
I am a non-native English speaker. I hope you can understand my answer and that it helps you.
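The uncertainty-weighting idea from the paper cited above can be sketched as a small module with one learnable log-variance per task. This is a simplified illustration of the approach, not the paper's own code: the class name and the combined form `sum_i exp(-s_i) * L_i + s_i` follow the common simplification of that formulation, and the task losses below are placeholders.

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learnable task weighting via homoscedastic uncertainty (sketch).

    Each task i gets a learnable log-variance s_i, and the combined loss is
    sum_i exp(-s_i) * L_i + s_i, so the task weights are trained jointly
    with the model instead of being hand-tuned. The regularizing `+ s_i`
    term keeps the optimizer from driving all weights to zero.
    """

    def __init__(self, num_tasks):
        super().__init__()
        # s_i = 0 initially, i.e. all tasks start with weight exp(0) = 1.
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, losses):
        total = torch.zeros((), dtype=self.log_vars.dtype)
        for s, loss in zip(self.log_vars, losses):
            total = total + torch.exp(-s) * loss + s
        return total
```

The `log_vars` parameter is registered with the optimizer alongside the model's own parameters, so the relative weighting of `f` and `g` adapts during training even though their decrease rates differ.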