How to change the channels of a mask

Posted on 2025-01-11 02:03:34 | Word count: 2989 | Views: 5 | Comments: 0

I have segmentation masks for a dataset that look like this:

I want to change each mask file to something like this (where every class is a different shade of gray). This one has a single channel:

The latter kind of mask works better with the code below, but the dataset I want to use has the "colorful" masks:

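import math
import os
import random
from glob import glob

import tensorflow as tf
import tensorflow_addons as tfa  # used by image_augmentation

# CIHP has 20 labels and Headsegmentation has 14 labels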

image_size = 512
batch = 4
labels = 20
data_directory = "/content/CIHP/instance-level_human_parsing/"
sample_train_images = len(os.listdir(data_directory + 'Training/Images/')) - 1
sample_validation_images = len(os.listdir(data_directory + 'Validation/Images/')) - 1
test_images = len(os.listdir('/content/headsegmentation_final/Test/')) - 1
print('Train size: ' + str(sample_train_images))
print('Validation size: ' + str(sample_validation_images))

t_images = sorted(glob(os.path.join(data_directory, "Training/Images/*")))[:sample_train_images]
t_masks = sorted(glob(os.path.join(data_directory, "Training/Category_ids/*")))[:sample_train_images]
v_images = sorted(glob(os.path.join(data_directory, "Validation/Images/*")))[:sample_validation_images]
v_masks = sorted(glob(os.path.join(data_directory, "Validation/Category_ids/*")))[:sample_validation_images]
# The test set lives outside data_directory, so glob its absolute path directly.
test_images = sorted(glob("/content/headsegmentation_final/Test/*"))[:test_images]

def image_augmentation(img, random_range):
    img = tf.image.random_flip_left_right(img)
    img = tfa.image.rotate(img, random_range)

    return img

def image_process(path, mask=False):
    img = tf.io.read_file(path)

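    # Note: inside Dataset.map this Python-level random draw runs once, when the
    # function is traced, so every element would get the same angle.
    upper = 90 * (math.pi/180.0) # degrees -> radian
    lower = 0 * (math.pi/180.0)
    ran_range = random.uniform(lower, upper)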

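    if mask:
        img = tf.image.decode_png(img, channels=1)
        img.set_shape([None, None, 1])
        # Nearest-neighbour resizing keeps label ids intact (and keeps the uint8 dtype);
        # the default bilinear interpolation would blend neighbouring class ids.
        img = tf.image.resize(images=img, size=[image_size, image_size], method="nearest")
        #img = image_augmentation(img, ran_range)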

    else:
        img = tf.image.decode_jpeg(img, channels=3)
        img.set_shape([None, None, 3])
        img = tf.image.resize(images=img, size=[image_size, image_size])
        img = img / 127.5 - 1
        #img = image_augmentation(img, ran_range)

    return img

def data_loader(image_list, mask_list):
    img = image_process(image_list)
    mask = image_process(mask_list, mask=True)
    return img, mask

def data_generator(image_list, mask_list):

    cihp_dataset = tf.data.Dataset.from_tensor_slices((image_list, mask_list))
    cihp_dataset = cihp_dataset.map(data_loader, num_parallel_calls=tf.data.AUTOTUNE)
    cihp_dataset = cihp_dataset.batch(batch, drop_remainder=True)

    return cihp_dataset

train_dataset = data_generator(t_images, t_masks)
val_dataset = data_generator(v_images, v_masks)

print("Train Dataset:", train_dataset)
print("Val Dataset:", val_dataset)

Basically, I want to iterate over every "colorful" mask file and convert it to the latter format. I can do the iteration, but I don't know how to convert the mask files.

时光匆匆的小流年 2025-01-18 02:03:34

Judging from your code (and this makes a lot of sense), your label images are not really RGB images (i.e., 3 channels per pixel) but single-channel, indexed-color images:

if mask == True:
  img = tf.image.decode_png(img, channels=1)

When you display the label image (how exactly are you doing so?), you are using a color map that assigns a specific color to each label.
However, when your image_process function reads the mask image, it returns not a 3-channel RGB image but only the label index map, which you can treat as a grayscale image. No conversion is needed.
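
To make the point above concrete, here is a minimal NumPy sketch of how a single-channel label map turns "colorful" once a color map is applied, plus the reverse lookup in case a mask file really is stored as 3-channel RGB. The palette values and the helper names `labels_to_color` and `color_to_labels` are assumptions made for illustration; they are not taken from CIHP or from the code in the question.

```python
import numpy as np

# Hypothetical 4-class palette: one RGB colour per label index (assumed values).
PALETTE = np.array(
    [
        [0, 0, 0],    # label 0, e.g. background
        [128, 0, 0],  # label 1
        [0, 128, 0],  # label 2
        [0, 0, 128],  # label 3
    ],
    dtype=np.uint8,
)


def labels_to_color(label_map, palette=PALETTE):
    """Colourise an (H, W) label map by looking each index up in the palette."""
    return palette[label_map]


def color_to_labels(color_mask, palette=PALETTE):
    """Map an (H, W, 3) colour mask back to an (H, W) map of label indices.

    Raises ValueError if some pixel colour is not present in the palette.
    """
    # Compare every pixel with every palette entry: boolean array of shape (H, W, K).
    matches = (color_mask[:, :, None, :] == palette[None, None, :, :]).all(axis=-1)
    if not matches.any(axis=-1).all():
        raise ValueError("mask contains a colour that is not in the palette")
    # The position of the matching palette entry is the label index.
    return matches.argmax(axis=-1).astype(np.uint8)


label_map = np.array([[0, 1], [2, 3]], dtype=np.uint8)
color_mask = labels_to_color(label_map)   # what a viewer with a colour map shows
recovered = color_to_labels(color_mask)   # back to single-channel label ids
print(color_mask.shape, recovered.tolist())
```

If the mask files already decode to a single channel of label ids, as the code in the question suggests, only the first direction is relevant and it matters for display only; the reverse lookup is needed only when the colors are actually baked into the file.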
