Images rotated during training

Posted 2025-02-09 05:14:06

I am trying to train an ssd_mobilenet_v2_keras model for object detection on a dataset of roughly 6000 images. The problem is that the images are rotated randomly during training (or at least, that is what it looks like in TensorBoard). This is the configuration I am using in the pipeline.config file:

train_config {
  batch_size: 32
  data_augmentation_options {
    random_horizontal_flip {
    }
  }

  data_augmentation_options {
    random_rgb_to_gray {
        probability: 0.25
    }
  }

  data_augmentation_options {
    random_jpeg_quality {
        random_coef: 0.8
        min_jpeg_quality: 50
        max_jpeg_quality: 100
    }
  }

  sync_replicas: true
  optimizer {
    adam_optimizer: {
      epsilon: 1e-7
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 1e-3
          total_steps: 50000
          warmup_learning_rate: 2.5e-4
          warmup_steps: 5000
        }
      }
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "pre-trained-models/ssd_mobilenet_v2_320x320_coco17_tpu-8/checkpoint/ckpt-0"
  num_steps: 50000
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection"
  fine_tune_checkpoint_version: V2
}

I have also tried removing the random horizontal flip (I knew that would probably not solve anything, I just gave it a try...) but nothing changes: I still see some training images rotated in TensorBoard, and if I run the evaluation, sometimes the images are rotated there too. Of course, the XML with the bounding box coordinates is not "rotated", so the ground truth image in TensorBoard appears completely wrong: the object is in one position and the ground truth box is in a completely different position (the right position if the image weren't rotated...).

Comments (1)

花之痕靓丽 2025-02-16 05:14:06

This may be long overdue, but I had the same problem and it took me weeks to figure out. I just want to share my solution. Also, sorry for the bad formatting; I'm still learning.

Apparently, there is an issue with Exif orientation metadata: a computer/laptop reads it differently than a smartphone/digital camera does. You can read more about it here.

  1. To check on this, first inspect the orientation of your images:
# Determine image orientation using Exif metadata
from pathlib import Path
from PIL import Image

IMAGE_PATHS = 'path/to/your/images'
image_dir = Path(IMAGE_PATHS).glob('*.jpg')

for image_path in image_dir:
    # Open the image and read its Exif metadata (not all images have it;
    # _getexif() may also return None)
    img = Image.open(image_path)
    exif_data = img._getexif() if hasattr(img, '_getexif') else None

    if exif_data:
        # Check if the Exif data contains the orientation tag (tag number 0x0112)
        if 0x0112 in exif_data:
            # Read the orientation value
            orientation = exif_data[0x0112]

            # Determine the orientation based on the value
            if orientation == 1:
                print("Normal (0 degrees)")
            elif orientation == 3:
                print("Upside down (180 degrees)")
            elif orientation == 6:
                print("Rotated 90 degrees clockwise")
            elif orientation == 8:
                print("Rotated 90 degrees counterclockwise")
            # Add more cases as needed for other orientation values
        else:
            print("No orientation tag found in Exif data.")
    else:
        print("No Exif data found in the image.")

If you receive any result besides 'Normal', your images are in fact rotated on disk. You may not see it on your screen, but they are: Windows and other operating systems automatically apply the Exif rotation when displaying the image for ease of viewing.
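
To see this for yourself rather than trusting the viewer, here is a minimal sketch (the sample path is a placeholder) that compares the pixel dimensions as stored on disk with the dimensions after applying the Exif orientation, which is what an Exif-aware viewer does automatically:

from PIL import Image, ImageOps

sample_path = 'path/to/your/images/sample.jpg'  # placeholder; point it at one of your photos

img = Image.open(sample_path)
# Image.open() does not apply the Exif orientation tag, so .size reflects the pixels as stored on disk
print('stored on disk (w, h):', img.size)

# ImageOps.exif_transpose() applies the orientation tag, just like most viewers do automatically
viewed = ImageOps.exif_transpose(img)
print('as displayed   (w, h):', viewed.size)

# If the two sizes differ (width and height swapped), the file is stored sideways and only the
# Exif tag makes it look upright in your viewer; the training pipeline sees the sideways pixels.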

  2. Correct the Exif orientation metadata:
import os

import cv2
import numpy as np
from PIL import Image


def load_image_into_numpy_array(path):
    # Helper as used in the TF Object Detection API tutorials: returns the raw stored
    # pixels as an RGB numpy array. PIL does not apply the Exif rotation, which is
    # exactly the view the training pipeline sees.
    return np.array(Image.open(path))


image_dir = 'path/to/your/image'  # folder must contain images only
for file_name in os.listdir(image_dir):
    # print(file_name)
    save_path = os.path.join(image_dir, file_name)

    image_np = load_image_into_numpy_array(save_path)

    # numpy shape is (height, width, channels). If the stored pixels are landscape
    # (height < width), the portrait photo was saved sideways with an Exif orientation
    # tag, so rotate the pixels back and re-save; cv2.imwrite writes no Exif data.
    # Note: np.rot90(..., 3) rotates 90 degrees clockwise, which matches orientation
    # value 6; images tagged with orientation 8 would need np.rot90(image_np, 1).
    if image_np.shape[0] < image_np.shape[1]:
        image_np = np.rot90(image_np, 3).copy()
    else:
        continue
    # image_np is RGB; convert to BGR before writing with OpenCV
    cv2.imwrite(save_path, cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))

From there, you can run training again and check TensorBoard's 'train_input_image' to verify that your new training images are correct. Hope this helps.
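
As a side note, the height-vs-width check above assumes the affected photos are portrait shots tagged with Exif orientation 6. If your dataset mixes orientation values (3, 6 and 8), a possible alternative (just a sketch with placeholder paths, not the method above) is to let Pillow bake the Exif orientation into the pixels of every image and drop the tag, whatever its value:

import os
from PIL import Image, ImageOps

image_dir = 'path/to/your/images'  # placeholder; folder containing only images

for file_name in os.listdir(image_dir):
    if not file_name.lower().endswith(('.jpg', '.jpeg')):
        continue
    path = os.path.join(image_dir, file_name)

    with Image.open(path) as img:
        # Rotate/flip the pixels according to the Exif orientation tag (handles 3, 6 and 8 alike);
        # exif_transpose also removes the orientation entry from the copy it returns
        upright = ImageOps.exif_transpose(img)

    # Overwrite the file with upright pixels (re-encoding a JPEG is lossy, adjust quality as needed)
    upright.save(path, quality=95)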
