Non-normalized labels in yolov5

Posted on 2025-01-16 23:00:30

I am training yolov5 on my custom dataset and am getting the non-normalized labels error. The annotations give x, y, w and h, which means that the bounding box spans from (x, y) to (x+w, y+h). I am using the cv2 rectangle function to display the bounding boxes on the image, and it draws them perfectly. I understand that I have to convert my raw labels to normalized center x, center y, width, and height values. I am doing that below:

x2 = x + w  # x, y, w and h are given: top-left corner and box size in pixels
y2 = y + h

xc = x + w / 2  # box center in pixels
yc = y + h / 2
xc = xc / width  # normalize to 0-1; width and height are the image's width and height
yc = yc / height

wn = w / width  # normalize the box width and height to 0-1
hn = h / height

label_file.write(f"{category_idx} {xc} {yc} {wn} {hn}\n")

But when I write these labels in the text file and run the yolov5 training, it gives the following assertion error:

assert (l[:, 1:] <= 1).all(), 'non-normalized or out of bounds coordinate labels: %s' % file # throws assertion error
AssertionError: non-normalized or out of bounds coordinate labels: /Raja/Desktop/yolov5/data/roi/labels/train/10.txt

The 10.txt file is given below:

1 0.7504960317460317 0.3599537037037037 0.16765873015873023 0.059193121693121686
4 0.21664186507936506 0.3316798941798942 0.19122023809523808 0.0443121693121693
5 0.47879464285714285 0.2931547619047619 0.32663690476190477 0.04728835978835977
0 0.265625 0.47701719576719576 0.3045634920634921 0.0889550264550264
1 0.17671130952380953 0.5830026455026455 0.13120039682539683 0.07275132275132279
2 0.5212053571428572 0.7986111111111112 0.15550595238095244 0.07407407407407407
2 0.7638888888888888 0.8009259259259259 0.16121031746031755 0.07275132275132279

I am using the cv2 rectangle function to display the bounding boxes on the image and it is creating the perfect bounding boxes as displayed in the picture below:

cv2.rectangle(temp_img,(int(x), int(y)),(int(x+w), int(y+h)),color=(0, 255, 0),thickness=2)

I have tried to find a solution online, for example in this issue raised on GitHub, but haven't found anything yet. Can anyone please tell me what I am doing wrong here? I believe the problem lies in converting the raw labels to 0-1 normalized labels, since the assertion states that it has found non-normalized labels. Any help will be highly appreciated!

浅蓝的眸勾画不出的柔情 2025-01-23 23:00:30

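YOLOv5 requires the dataset to be in the darknet format. Here is an outline of it:

One txt label file per image
One row per object
Each row is in class x_center y_center width height format.
Box coordinates must be in normalized xywh format (from 0 to 1). If your boxes are in pixels, divide x_center and width by the image width, and y_center and height by the image height.
Class numbers are zero-indexed (they start from 0).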
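
The rules listed above can be checked mechanically before training starts. The following is only a minimal sketch: `check_label_line` is a hypothetical helper written for illustration, not a YOLOv5 function, and it assumes class indices are written as plain non-negative integers.

```python
def check_label_line(line):
    """Return a list of problems found in one YOLO-format label line.

    Hypothetical helper for illustration; it is not part of YOLOv5.
    """
    fields = line.split()
    if len(fields) != 5:
        return [f"expected 5 fields, got {len(fields)}"]

    problems = []
    cls, *coords = fields
    # Assumption: the class index is written as a plain non-negative integer.
    if not cls.isdigit():
        problems.append(f"class index {cls!r} is not a non-negative integer")

    names = ("x_center", "y_center", "width", "height")
    for name, value in zip(names, coords):
        try:
            v = float(value)
        except ValueError:
            problems.append(f"{name}={value!r} is not a number")
            continue
        # Normalized coordinates must lie in the closed interval [0, 1].
        if not 0.0 <= v <= 1.0:
            problems.append(f"{name}={v} is outside [0, 1]")
    return problems
```

Running it over every line of every label file, and not only the file named in the error message, would show which rows, if any, break the format.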

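Example:

Image properties: width=1156 pix, height=1144 pix.
Bounding box properties: xmin=1032, ymin=20, xmax=1122, ymax=54, object_name="Ring".
Let objects_list="bracelet","Earring","Ring","Necklace"

YOLOv5 format: f"{category_idx} {x1 + bbox_width / 2} {y1 + bbox_height / 2} {bbox_width} {bbox_height}\n" (here x1 and y1 are the normalized xmin and ymin, and bbox_width and bbox_height are the normalized box size)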

$bbox_{width} = x_{max}/width - x_{min}/width = (1122-1032)/1156 = 0.07785467128027679$
$bbox_{height} = y_{max}/height - y_{min}/height = (54-20)/1144 = 0.029720279720279717$
$x_{center} = x_{min}/width + bbox_{width}/2 = 0.9316608996539792$
$y_{center} = y_{min}/height + bbox_{height}/2 = 0.032342657342657344$
category_idx=2
Final result: 2 0.9316608996539792 0.032342657342657344 0.07785467128027679 0.029720279720279717
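
The same arithmetic can be written as a small function. This is a sketch under assumptions: `to_yolo_line` is a hypothetical name, and the corner-style inputs follow the example above. For annotations given as top-left x, y plus w, h, as in the question, pass xmax = x + w and ymax = y + h.

```python
def to_yolo_line(class_idx, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box to a normalized YOLO label line.

    Hypothetical helper for illustration; it is not part of YOLOv5.
    """
    bbox_w = (xmax - xmin) / img_w        # normalized box width
    bbox_h = (ymax - ymin) / img_h        # normalized box height
    x_center = xmin / img_w + bbox_w / 2  # normalized center x
    y_center = ymin / img_h + bbox_h / 2  # normalized center y
    return f"{class_idx} {x_center} {y_center} {bbox_w} {bbox_h}"


# Values taken from the example above.
objects_list = ["bracelet", "Earring", "Ring", "Necklace"]
line = to_yolo_line(objects_list.index("Ring"), 1032, 20, 1122, 54, 1156, 1144)
print(line)
```

The printed values should agree with the final result above up to floating-point rounding in the last digits.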