Skewing an image using a perspective transform

Posted 2024-08-24 18:56:45

I'm trying to skew an image, like the one shown here:

(source: microsoft.com)

I have an array of pixels representing my image and am unsure what to do with them.

Comments (3)

美人迟暮 2024-08-31 18:56:45

A much better way to do this is by inverse mapping.

Essentially, you want to "warp" the image, right? That means every pixel in the source image goes to a predefined point. The predefinition is a transformation matrix that tells you how to rotate, scale, translate, shear, etc. the image. It essentially takes some coordinate (x,y) on your image and says, "OK, the new position for this pixel is (f(x),g(y))."

That's essentially what "warping" does.

Now, think about scaling an image, say to ten times its size. The pixel at (1,1) becomes the pixel at (10,10), and the next pixel, (1,2), becomes the pixel at (10,20) in the new image. But if you keep doing this, you will have no value for a pixel such as (13,13), because (1.3,1.3) is not defined in your original image, and you will have a bunch of holes in your new image. You'll have to interpolate that value from the four pixels around it in the new image, i.e. (10,10), (10,20), (20,10), (20,20). This is called bilinear interpolation.
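To make the "holes" concrete, here is a minimal sketch of forward mapping in plain Python. The function name `forward_scale` and the tiny 2x2 image are made up for illustration, and indices here start at 0 rather than 1.

```python
def forward_scale(src, factor):
    """Forward-map each source pixel (x, y) to (factor*x, factor*y).

    Output pixels that no source pixel lands on stay None (the "holes").
    """
    h, w = len(src), len(src[0])
    dst = [[None] * (w * factor) for _ in range(h * factor)]
    for y in range(h):
        for x in range(w):
            dst[y * factor][x * factor] = src[y][x]
    return dst


src = [[1, 2],
       [3, 4]]
dst = forward_scale(src, 3)

# Count the output pixels that were never written.
holes = sum(value is None for row in dst for value in row)
print(holes, "of", len(dst) * len(dst[0]), "output pixels are holes")
```

Only the four source pixels land anywhere; every other output pixel has to be filled in afterwards by interpolating on the output image.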

But here's another problem: suppose your transformation isn't a simple scaling but an affine one (like the sample image you've posted). Then (1,1) would become something like (2.34,4.21), you'd have to round it to (2,4) in the output image, and then you'd still have to do bilinear (or more complicated) interpolation on the new image to fill in the holes. Messy, right?

Now, there's no way to get out of interpolation, but we can get away with doing bilinear interpolation just once. How? Simple: inverse mapping.

Instead of looking at it as the source image going to the new image, think of where the data for the new image will come from in the source image! So (1,1) in the new image will come from some reverse-mapped position in the source image, say (3.4, 2.1), and then you do bilinear interpolation on the source image to figure out the corresponding value!
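A rough sketch of that idea in plain Python follows. The names `bilinear` and `warp_inverse` are hypothetical, the image is a list of rows of floats, and clamping at the border is just one possible policy (an assumption, not something the answer prescribes).

```python
import math


def bilinear(src, x, y):
    """Sample src at a real-valued (x, y), clamping to the image border."""
    h, w = len(src), len(src[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally on the two neighbouring rows, then vertically.
    top = src[y0][x0] * (1 - fx) + src[y0][x1] * fx
    bottom = src[y1][x0] * (1 - fx) + src[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_inverse(src, inverse_map, out_w, out_h):
    """For every output pixel, look up where it comes from and sample there."""
    return [[bilinear(src, *inverse_map(x, y)) for x in range(out_w)]
            for y in range(out_h)]


src = [[0.0, 10.0],
       [20.0, 30.0]]
# Inverse of a 3x upscale: output (x, y) comes from source (x / 3, y / 3).
out = warp_inverse(src, lambda x, y: (x / 3.0, y / 3.0), 6, 6)
```

Every output pixel gets a value in a single pass, so there are no holes to patch afterwards.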

Transformation matrix

Ok, so how do you define a transformation matrix for an affine transformation? This website tells you how to do it by compositing different transformation matrices for rotation, shearing, etc.

Transformations:

(image; source: mathieu at people.gnome.org)

Compositing:

(image; source: mathieu at people.gnome.org)

The final matrix is obtained by composing the individual matrices in order; you then invert it to get the inverse mapping. Use that to compute the position of each output pixel in the source image, and interpolate.
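As an illustration of composing and inverting, here is a small sketch using 3x3 homogeneous matrices in plain Python. All helper names (`matmul`, `rotation`, `shear_x`, `translation`, `invert_affine`, `apply_affine`) and the chosen angle, shear and offsets are assumptions for the example.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def shear_x(k):
    return [[1.0, k, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def invert_affine(m):
    """Invert a 3x3 affine matrix whose last row is [0, 0, 1]."""
    a, b, tx = m[0]
    c, d, ty = m[1]
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transform is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    # The inverse translation is -(A^-1 * t).
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]


def apply_affine(m, x, y):
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Shear first, then rotate, then translate (the rightmost matrix acts first).
forward = matmul(translation(5.0, 2.0),
                 matmul(rotation(math.radians(30)), shear_x(0.5)))
inverse = invert_affine(forward)

x2, y2 = apply_affine(forward, 3.0, 4.0)   # source -> output
x1, y1 = apply_affine(inverse, x2, y2)     # output -> back to source
```

In a warp loop you would call `apply_affine(inverse, x, y)` for each output pixel and interpolate in the source image at the returned position.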

终难遇 2024-08-31 18:56:45

If you don't feel like re-inventing the wheel, check out the OpenCV library. It implements many useful image processing functions, including perspective transformations. Check out cvWarpPerspective, which I've used to accomplish this task quite easily.
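If you're curious what such a function needs as input, a perspective warp is driven by a 3x3 matrix that can be computed from four point correspondences. The sketch below is plain Python and is not OpenCV's code; `solve`, `perspective_matrix`, `apply_perspective` and the sample corner points are all made up for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src_pts, dst_pts):
    """3x3 matrix H (with H[2][2] = 1) mapping each src point to its dst point."""
    a, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_perspective(h, x, y):
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Map a square onto a trapezoid (the top edge is pulled inwards).
src_pts = [(0, 0), (100, 0), (100, 100), (0, 100)]
dst_pts = [(20, 0), (80, 0), (100, 100), (0, 100)]
H = perspective_matrix(src_pts, dst_pts)
```

In OpenCV's Python bindings the corresponding calls are `cv2.getPerspectiveTransform` (matrix from four point pairs) and `cv2.warpPerspective` (applying it to an image), which is much less work than rolling your own.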

凑诗 2024-08-31 18:56:45

As KennyTM commented, you just need an affine transform: a linear map followed by a translation, obtained by multiplying every pixel position by a matrix M and adding a translation vector V. It's simple math:

end_pixel_position = M*start_pixel_position + V

where M is a composition of simple transformations like rotations or scalings, and V is a vector that translates every point of your image by adding fixed offsets to its coordinates.

For example, if you want to rotate the image, you can define the rotation matrix as:

    | cos(a) -sin(a) |
M = |                |
    | sin(a)  cos(a) |

where a is the angle you want to rotate your image by.

Scaling uses a matrix of the form:

    | s1   0 |
M = |        |
    | 0   s2 |

where s1 and s2 are the scaling factors along the two axes.

For translation you just have the vector V:

    | t1 |
V = |    |
    | t2 |

that adds t1 and t2 to pixel coordinates.

You then combine the matrices into a single transformation. For example, if you have scaling, rotation and translation, you'll end up with something like:

| x2 |             | x1 |
|    | = M1 * M2 * |    | + T
| y2 |             | y1 |

where:

  • x1 and y1 are the pixel coordinates before applying the transform,
  • x2 and y2 are the pixel coordinates after the transform,
  • M1 and M2 are the matrices used for scaling and rotation (REMEMBER: matrix composition is not commutative! Usually M1 * M2 * Vect != M2 * M1 * Vect),
  • T is the translation vector used to translate every pixel.
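
The formula above, and the warning about ordering, can be checked with a few lines of plain Python. This is only a sketch: `mat2_mul`, `transform` and the sample values are hypothetical, and the 90-degree rotation is written out with exact entries to avoid floating-point noise.

```python
def mat2_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def transform(m, t, p):
    """Compute m * p + t for a 2x2 matrix m, translation t and point p."""
    return (m[0][0] * p[0] + m[0][1] * p[1] + t[0],
            m[1][0] * p[0] + m[1][1] * p[1] + t[1])


scale = [[2.0, 0.0],
         [0.0, 1.0]]    # s1 = 2, s2 = 1
rot90 = [[0.0, -1.0],
         [1.0, 0.0]]    # rotation by 90 degrees (cos = 0, sin = 1)
T = (10.0, 0.0)
p = (1.0, 0.0)

a = transform(mat2_mul(scale, rot90), T, p)   # rotate first, then scale
b = transform(mat2_mul(rot90, scale), T, p)   # scale first, then rotate
print(a, b)
```

The two results differ because the rightmost matrix is applied to the point first, so swapping M1 and M2 changes the outcome.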