Why does categorical cross-entropy in TensorFlow give results different from the definition?

Posted on 2025-01-29 01:23:31

I am testing the output of tf.keras.losses.CategoricalCrossentropy, and it gives me values different from the definition.
My understanding of cross-entropy is:


def ce_loss_def(y_true, y_pred):
    return tf.reduce_sum(-tf.math.multiply(y_true, tf.math.log(y_pred)))

And let's say I have values like this:

pred = [0.1, 0.1, 0.1, 0.7]
target = [0, 0, 0, 1]
pred = tf.constant(pred, dtype = tf.float32)
target = tf.constant(target, dtype = tf.float32)

pred_2 = [0.1, 0.3, 0.1, 0.7]
target = [0, 0, 0, 1]
pred_2 = tf.constant(pred_2, dtype = tf.float32)
target = tf.constant(target, dtype = tf.float32)

By this definition, I think it should disregard the probabilities of the non-target classes, like this:

ce_loss_def(y_true = target, y_pred = pred), ce_loss_def(y_true = target, y_pred = pred_2)

(<tf.Tensor: shape=(), dtype=float32, numpy=0.35667497>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.35667497>)

But tf.keras.losses.CategoricalCrossentropy doesn't give me the same results:

ce_loss_keras = tf.keras.losses.CategoricalCrossentropy()

ce_loss_keras(y_true = target, y_pred = pred), ce_loss_keras(y_true = target, y_pred = pred_2)

outputs:

(<tf.Tensor: shape=(), dtype=float32, numpy=0.35667497>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.5389965>)

What am I missing?

Here is the link to the notebook I used to get this result:
https://colab.research.google.com/drive/1T69vn7MCGMSQ8hlRkyve6_EPxIZC1IKb#scrollTo=dHZruq-PGyzO


Comments (1)

疯到世界奔溃 2025-02-05 01:23:31

I found out what the problem was. With the default `from_logits=False`, Keras treats `y_pred` as probabilities and rescales the vector so its elements sum to 1 before taking the logarithm.

import tensorflow as tf

ce_loss = tf.keras.losses.CategoricalCrossentropy()

pred = [0.05, 0.2, 0.25, 0.5]
target = [0, 0, 0, 1]
pred = tf.constant(pred, dtype = tf.float32)
target = tf.constant(target, dtype = tf.float32)

pred_2 = [0.1, 0.3, 0.1, 0.5] # pred_2 has P(class2) = 0.3, instead of P(class2) = 0.1.
target = [0, 0, 0, 1]
pred_2 = tf.constant(pred_2, dtype = tf.float32)
target = tf.constant(target, dtype = tf.float32)

c1, c2 = ce_loss(y_true = target, y_pred = pred), ce_loss(y_true = target, y_pred = pred_2)
print("CE loss at default value: {}. CE loss with different probability of non-target classes: {}".format(c1, c2))

gives


CE loss at default value: 0.6931471824645996.
CE loss with different probability of non-target classes: 0.6931471824645996

As intended.
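The rescaling can also be checked against the numbers from the original question. Assuming Keras divides `y_pred` by its sum when `from_logits=False` (the default), the effective target-class probability for `pred_2 = [0.1, 0.3, 0.1, 0.7]` (which sums to 1.2) becomes 0.7 / 1.2, and `-log(0.7 / 1.2)` should reproduce the "surprising" 0.5389965. A minimal sketch:

```python
import tensorflow as tf

# pred_2 from the question sums to 1.2, not 1.0.
pred_2 = tf.constant([0.1, 0.3, 0.1, 0.7], dtype=tf.float32)
target = tf.constant([0, 0, 0, 1], dtype=tf.float32)

# Loss as reported by Keras.
keras_loss = tf.keras.losses.CategoricalCrossentropy()(y_true=target, y_pred=pred_2)

# Loss computed by hand after normalizing pred_2 to sum to 1.
manual_loss = -tf.math.log(0.7 / tf.reduce_sum(pred_2))

print(float(keras_loss))   # ~0.5389965
print(float(manual_loss))  # ~0.5389965
```

If the two printed values match, the discrepancy in the question is fully explained by this normalization rather than by any weighting of non-target classes.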
