How to convert YOLO label coordinates when cropping images?
I've created over 1200 images with labels for YOLO detection. The problem is that every image is 800x600 and all the labeled objects are in the middle of the image, so I want to crop away the rest.
The cropped images would be something like 400x300 (cropping left, right, top and bottom equally), with the objects still in the middle. But how do I convert the coordinates without labeling everything all over again?
# (used labelimg for yolo)
0 0.545000 0.722500 0.042500 0.091667
1 0.518750 0.762500 0.097500 0.271667
Here's one of my label .txt files. Sorry for my bad English!
I was just working this out myself, so here is a complete explanation of why the formula at the bottom is correct.
Let's go over how these annotations are formatted.
Each line is 5 numbers separated by spaces:
n x y w h
where n is the class index, x and y are the normalized center of the box, and w and h are its normalized width and height. W and H mean the width and height of the original image.
A normalized value is relative to the width or height of the image, not in pixels or any other unit. It is a proportion. For example, the x value is normalized like this: x[px] / W[px] = x normalized.
A few advantages of this:
The y axis goes from top to bottom. Everything else is like your standard coordinate system.
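To make the normalization concrete, here is a small sketch that turns one normalized label back into pixel corners. The function name `yolo_to_pixels` is made up for this illustration, and the numbers are the first label and the 800x600 image size from the question.

```python
def yolo_to_pixels(x, y, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center x, center y, width, height)
    to pixel corners (left, top, right, bottom)."""
    cx, cy = x * img_w, y * img_h      # box center in pixels
    bw, bh = w * img_w, h * img_h      # box size in pixels
    return cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2


# First label from the question, on an 800x600 image
left, top, right, bottom = yolo_to_pixels(0.545, 0.7225, 0.0425, 0.091667, 800, 600)
print(left, top, right, bottom)
```

Because y grows downwards, `top` is the smaller of the two y values.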
Now to cropping. Let's take this picture of a tree:
Scaling
We will now crop to the top left quarter of the tree image.
Our new image width W' is now only half of the original W, and likewise H' = 0.5*H. The center of the old image is now the bottom right corner. We know the center of the old image, p, is at (0.5, 0.5). In the new image the bottom right corner is at p' = (1, 1). If we cropped so that (0.3, 0.3) in the old image became the new bottom right, its new coordinate would also be (1, 1). 0.5 is ½, and to get from 0.5 to 1 we need to multiply by 2; for ⅓ we multiply by 3, for ¼ by 4. So if we reduce the width or height to a/b of its original size, we need to multiply by b/a.
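The scaling step on its own can be sketched like this. It only covers crops that keep the top left corner, and `scale_point`, `frac_w` and `frac_h` are names chosen for this example (the remaining fractions W'/W and H'/H).

```python
def scale_point(px, py, frac_w, frac_h):
    """Rescale a normalized point after a crop that keeps the top left corner.

    frac_w and frac_h are the fractions of the original width and height
    that remain (W'/W and H'/H). Dividing by a/b is multiplying by b/a.
    """
    return px / frac_w, py / frac_h


# Crop to the top left quarter: the old center becomes the new bottom right.
print(scale_point(0.5, 0.5, 0.5, 0.5))  # (1.0, 1.0)
# Crop so that (0.3, 0.3) is the new bottom right.
print(scale_point(0.3, 0.3, 0.3, 0.3))  # (1.0, 1.0)
```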
Translation
But we also want to move the top left of the image, our coordinate origin O.
Let's crop to the tree trunk:
W is 7 characters and the new width W' is 3. H = 5 and H' is 2. The old origin O is (0, 0) of course, and the new origin O' is at (2, 3) in characters, or normalized to the original image (2/7, 3/5), which is about (0.286, 0.6).
O' is at (0.286, 0.6) but should be (0, 0), so we subtract 0.286 from x and 0.6 from y before we scale the new value. This is not a very interesting test, because 0 times anything is 0.
Let's do another example: the bottom right of our new cropped image of the tree trunk. Let's call this point q. We know that in the coordinate system of the cropped image it must be q' = (1, 1); it's the bottom right, after all.
We already measured:
W=7 W'=3 H=5 H'=2
By how much did we reduce width and height as a proportion?
(W-W')/W is (7-3)/7, which is 4/7 or 0.571, so the part that remains is W'/W = 3/7. By the rule above we have to scale x by the inverse, 7/3 or about 2.33. For H: 3/5 is removed and 2/5 remains, so we scale by 5/2 = 2.5.
Let's call these scaling factors s_w = 7/3 and s_h = 5/2.
q is at (5, 5) in O, measured in characters. Let's put our formula to the test.
We moved our origin by 2 in the x direction and by 3 in the y direction. Let's call this Δw = 2 and Δh = 3.
For q'_x we subtract 2 from q_x because Δw = 2, and get 5-2 = 3. Now we normalize 3 by dividing by W = 7, which gives 3/7. Then we scale by s_w = 7/3, and yes, 7/3 times 3/7 is indeed 1. In the same way q'_y is (5-3)/5 = 2/5, and 5/2 times 2/5 is 1. Now that we have checked our logic we can write an algorithm.
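The tree trunk example can be checked with exact fractions. This sketch takes the scaling factors to be W/W' and H/H', following the rule from the scaling section (a size reduced to a/b is multiplied by b/a). The helper `to_cropped` and the variable names are made up for the illustration.

```python
from fractions import Fraction

W, H = 7, 5            # original size in characters
W2, H2 = 3, 2          # cropped size (W' and H')
dx, dy = 2, 3          # top left corner of the crop, in characters

s_w = Fraction(W, W2)  # 7/3
s_h = Fraction(H, H2)  # 5/2


def to_cropped(px, py):
    """Map a point given in characters of the original image to
    normalized coordinates of the cropped image: shift, normalize, scale."""
    return Fraction(px - dx, W) * s_w, Fraction(py - dy, H) * s_h


origin = to_cropped(2, 3)   # new top left corner O'
corner = to_cropped(5, 5)   # new bottom right corner q
print(origin, corner)
```

Both corners land where they should: the new origin at (0, 0) and the bottom right at (1, 1).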
The algorithm
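We already have normalized values, so the matter is simpler: Δw and Δh are the position of the new origin O' normalized to the original image, and the scaling factors are s_w = W/W' and s_h = H/H'.
For a point p in the original we can calculate p' in the new image like this:
p'_x = (p_x - Δw) * s_w
p'_y = (p_y - Δh) * s_h
The width and height of a box are not shifted, only scaled: w' = w * s_w and h' = h * s_h.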
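A minimal Python sketch of the shift-then-scale step for a whole YOLO label follows. It is a reconstruction from the derivation above, not the answer's original code. The names `crop_yolo_box`, `crop_label_line` and the `crop_*` parameters are assumptions, and the crop is given in normalized units of the original image.

```python
def crop_yolo_box(x, y, w, h, crop_x, crop_y, crop_w, crop_h):
    """Transform one normalized YOLO box into the coordinates of a crop.

    crop_x, crop_y: top left corner of the crop, normalized to the original
    crop_w, crop_h: crop size as a fraction of the original (W'/W and H'/H)
    """
    # Shift the center to the new origin, then scale by W/W' and H/H'.
    new_x = (x - crop_x) / crop_w
    new_y = (y - crop_y) / crop_h
    # Sizes are only scaled, not shifted.
    return new_x, new_y, w / crop_w, h / crop_h


def crop_label_line(line, crop_x, crop_y, crop_w, crop_h):
    """Apply crop_yolo_box to one 'n x y w h' line of a label file."""
    n, *values = line.split()
    box = crop_yolo_box(*map(float, values), crop_x, crop_y, crop_w, crop_h)
    return n + " " + " ".join(f"{v:.6f}" for v in box)


# Centered 400x300 crop of an 800x600 image: the corner is at
# (200/800, 150/600) and half of each dimension remains.
new_line = crop_label_line("0 0.545000 0.722500 0.042500 0.091667",
                           0.25, 0.25, 0.5, 0.5)
print(new_line)
```

Results are not clamped here, so a box near the border of the crop can end up partly or fully outside [0, 1]. With the first label from the question the new center is at y' = 0.945, and the box extends past the bottom edge of the crop.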
Correcting annotations
After cropping, some annotations have to be dropped because they are cropped out, and others have to be adjusted because they are partially cropped out.
All values must be in the interval [0, 1].
Completely cropped out annotations, with x', y', w', h' being the transformed values, will have x' < -w'/2 or x' > 1 + w'/2, or y' < -h'/2 or y' > 1 + h'/2.
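The test for completely cropped out boxes might be sketched as follows. The function name `is_fully_outside` is hypothetical, and the inputs are assumed to be already transformed into the coordinates of the crop. Boxes that only touch the border are counted as outside as well.

```python
def is_fully_outside(x, y, w, h):
    """True if a box, given in normalized coordinates of the crop,
    has no overlap with the crop (the unit square)."""
    return (x + w / 2 <= 0 or x - w / 2 >= 1 or
            y + h / 2 <= 0 or y - h / 2 >= 1)


print(is_fully_outside(-0.2, 0.5, 0.1, 0.1))   # entirely left of the crop
print(is_fully_outside(0.5, 1.0, 0.2, 0.2))    # straddles the bottom edge
```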
Partially cropped
If you want to keep or drop annotations depending on how much of their area is still visible, for example dropping those with less than 1/4 of their area visible and keeping those in the range [0.25, 1), it will be more complicated.
Intersection area in cropped image
We can view this problem as calculating the intersection area between two rectangles. For convenience the function also returns the percentage of area in frame.
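One way to write such a function is sketched below. It is not the answer's original code. `visible_part` is a made-up name, the box is assumed to be already transformed into crop coordinates, and the share of area in frame is returned as a fraction between 0 and 1 instead of a percentage.

```python
def visible_part(x, y, w, h):
    """Clip a transformed YOLO box to the crop (the unit square).

    Returns (clipped_box, fraction). clipped_box is (x, y, w, h) of the
    visible part, or None if nothing is visible. fraction is the share
    of the box area that is still in frame.
    """
    left, right = max(x - w / 2, 0.0), min(x + w / 2, 1.0)
    top, bottom = max(y - h / 2, 0.0), min(y + h / 2, 1.0)
    if right <= left or bottom <= top:
        return None, 0.0               # no intersection with the crop
    iw, ih = right - left, bottom - top
    fraction = (iw * ih) / (w * h)     # intersection area / box area
    return ((left + right) / 2, (top + bottom) / 2, iw, ih), fraction


# A box centered on the right edge: half of it is still in frame.
clipped, fraction = visible_part(1.0, 0.5, 0.2, 0.2)
print(clipped, fraction)
```

A threshold such as `fraction >= 0.25` can then decide whether to keep the clipped box or drop the annotation.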