In PyTorch, how can I one-hot encode a grayscale image like this for semantic segmentation?
I'm using the pretrained DeepLabV3 model for image segmentation, and it gives me an output with shape BxCxWxH, where B=batch size, C=number of classes, W=width and H=height. If I take the depth-wise argmax of this output, I get a WxH result where every pixel represents a class. For this output, I have a grayscale image as the label, with WxH shape. However, the pixel values in the grayscale label image are not in the range of 0 to the number of classes, but roughly 0.0xx to 0.2, so I can't use it to calculate the loss. To do that, I have to one-hot encode the label image, but I don't know how.
For example, the label image has the following values:
tensor([[[0.0824, 0.0824, 0.0824, ..., 0.0431, 0.0431, 0.0317],
[0.0824, 0.0824, 0.0824, ..., 0.0431, 0.0431, 0.0317],
[0.0824, 0.0824, 0.0824, ..., 0.0431, 0.0431, 0.0317],
...,
[0.0275, 0.0275, 0.0275, ..., 0.0275, 0.0275, 0.0317],
[0.0275, 0.0275, 0.0275, ..., 0.0275, 0.0275, 0.0317],
[0.0275, 0.0275, 0.0275, ..., 0.0275, 0.0275, 0.0317]]])
with 14152 unique pixel values. The size of the image is 1024x1024. How could I one-hot encode this image?
The Dataset is the KITTI Semantics Pixel level.
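One way to handle this, assuming each distinct float value corresponds to one class (e.g. a class ID divided by 255 when the PNG was loaded), is to map the unique pixel values to consecutive integer class indices and one-hot encode those. A minimal sketch (the small `label` tensor here is a made-up stand-in for the real 1024x1024 ground truth):

```python
import torch
import torch.nn.functional as F

# Hypothetical label tensor with arbitrary float values, standing in
# for the real grayscale ground-truth image.
label = torch.tensor([[0.0824, 0.0431],
                      [0.0275, 0.0317]])

# Map each unique float value to a consecutive class index.
values = label.unique()                   # sorted unique pixel values
indices = torch.bucketize(label, values)  # WxH tensor of class indices

# One-hot encode: WxHxC, then permute to CxWxH to match the model output.
one_hot = F.one_hot(indices, num_classes=len(values)).permute(2, 0, 1).float()
```

Note that if the loss is `nn.CrossEntropyLoss`, the `indices` tensor is all that's needed; it expects integer class indices as the target, not a one-hot tensor.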
1 Answer
Okay, so I finally found the problem: I had accidentally resized the input ground-truth image by passing it through a torchvision.transforms instance, turning it into an interpolated tensor with lots of spurious pixel values. Without this operation, I get the normal pixel values for the ground-truth image.
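The underlying issue is that the default (bilinear) interpolation blends neighboring class IDs into meaningless intermediate values. If a label map must be resized, nearest-neighbor interpolation preserves the original label values. A sketch of the difference, using a tiny made-up label map:

```python
import torch

# Tiny stand-in label map with three classes (0, 1, 2).
label = torch.tensor([[0, 0, 2, 2],
                      [0, 0, 2, 2],
                      [1, 1, 1, 1],
                      [1, 1, 1, 1]], dtype=torch.float32)

# Nearest-neighbor downsampling keeps only values that already exist
# in the label map; bilinear would produce blended in-between values.
resized = torch.nn.functional.interpolate(
    label[None, None],      # add batch and channel dims: 1x1xHxW
    size=(2, 2),
    mode="nearest",
).squeeze().long()          # back to HxW integer class indices
```

With torchvision transforms, the equivalent is passing `interpolation=transforms.InterpolationMode.NEAREST` to `transforms.Resize` for the label images, while keeping the default interpolation for the input images.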