Inverting a real-valued index grid

Posted on 2025-02-01 23:07:06

OpenCV's remap() uses a real-valued index grid to sample a grid of values from an image using bilinear interpolation, and returns the grid of samples as a new image.

To be precise, let:

A = an image 
X = a grid of real-valued X coords into the image. 
Y = a grid of real-valued Y coords into the image.
B = remap(A, X, Y)

Then for all pixel coordinates i, j,

B[i, j] = A(X[i, j], Y[i, j]) 

Where the round-braces notation A(x, y) denotes using bilinear interpolation to solve for the pixel value of image A using float-valued coords x and y.
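
To make the A(x, y) notation concrete, here is a minimal NumPy sketch of bilinear sampling. It assumes the OpenCV convention that x indexes columns and y indexes rows, and it clamps coordinates to the image border; the name bilinear_sample is illustrative and not part of any library.

```python
import numpy as np


def bilinear_sample(A, x, y):
    """Sample image A at real-valued coordinates (x = column, y = row)."""
    A = np.asarray(A, dtype=float)
    h, w = A.shape  # assumes a single-channel image with h, w >= 2
    # Clamp to the valid range (an assumption; remap() offers other border modes)
    x = np.clip(x, 0, w - 1)
    y = np.clip(y, 0, h - 1)
    # Top-left corner of the cell that contains (x, y)
    x0 = np.minimum(np.floor(x).astype(int), w - 2)
    y0 = np.minimum(np.floor(y).astype(int), h - 2)
    fx, fy = x - x0, y - y0  # fractional position inside the cell
    top = A[y0, x0] * (1 - fx) + A[y0, x0 + 1] * fx
    bottom = A[y0 + 1, x0] * (1 - fx) + A[y0 + 1, x0 + 1] * fx
    return top * (1 - fy) + bottom * fy
```

With this helper, B[i, j] = A(X[i, j], Y[i, j]) becomes B = bilinear_sample(A, X, Y) for whole grids X and Y.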

My question is: given an index grid X, Y, how can I generate an "inverse grid" X^-1, Y^-1 such that:

X(X^-1[i, j], Y^-1[i, j]) = i
Y(X^-1[i, j], Y^-1[i, j]) = j

And

X^-1(X[i, j], Y[i, j]) = i
Y^-1(X[i, j], Y[i, j]) = j

For all integer pixel coordinates i, j?

FWIW, the image and index maps X and Y are the same shape. However, there is no a priori structure to the index maps X and Y. For example, they're not necessarily affine or rigid transforms. They may even be uninvertible, e.g. if X, Y maps multiple pixels in A to the same exact pixel coordinate in B. I'm looking for ideas for a method that will find a reasonable inverse map if one exists.

The solution need not be OpenCV-based, as I'm not using OpenCV, but another library that has a remap() implementation. While any suggestions are welcome, I'm particularly keen on something that's "mathematically correct", i.e. if my map M is perfectly invertible, the method should find the perfect inverse, within some small margin of machine precision.
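
One way to judge any candidate inverse against the first pair of conditions above is to measure the round-trip error directly. The sketch below is an illustration only, not part of the question: it assumes x is the column and y the row coordinate, uses scipy.ndimage.map_coordinates with order=1 as the bilinear sampler, and the name roundtrip_error is made up.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def roundtrip_error(X, Y, Xinv, Yinv):
    """Per-pixel distance between (X, Y) sampled at (Xinv, Yinv) and the pixel's own index."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    rows, cols = np.indices(X.shape)
    # map_coordinates expects (row, column) order; order=1 is bilinear interpolation
    coords = np.stack([Yinv, Xinv])
    x_back = map_coordinates(X, coords, order=1, mode="nearest")
    y_back = map_coordinates(Y, coords, order=1, mode="nearest")
    return np.hypot(x_back - cols, y_back - rows)
```

For a perfectly invertible map the error should be near machine precision wherever the inverse stays inside the image; pixels whose inverse falls outside the grid are clamped by mode="nearest" and should be masked out before taking a maximum.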

Comments (15)

城歌 2025-02-08 23:07:06

Iterative solution

Many of the above solutions didn't work for me, failed when the map wasn't invertible, or weren't terribly fast.

I present an alternative, 6-line iterative solution.

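def invert_map(F):
    # F: (H, W, 2) float32 array; F[..., 0] is the x map, F[..., 1] the y map
    I = np.zeros_like(F)
    I[:,:,1], I[:,:,0] = np.indices(F.shape[:2])  # identity map (no longer relies on a global `sh`)
    P = np.copy(I)
    for i in range(10):
        # Fixed-point update: P <- P + I - F(P)
        P += I - cv.remap(F, P, None, interpolation=cv.INTER_LINEAR)
    return P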

How well does it do?
For my use case of inverting a terrain correction map for aerial photography, this method converges comfortably in 10 steps to 1/10th of a pixel. It's also blazingly fast, because all the heavy compute is tucked inside OpenCV.

How does it work?

The approach uses the idea that if (x', y') = F(x, y) is a mapping that displaces each point only slightly, F = I + D, then the inverse can be approximated with (x, y) = (x', y') - D(x', y'), as long as the gradient of D is small.

We can continue to refine our mapping. Starting from G_0 = I (I is an "identity mapping"), the above gets us our first prediction:

G_1 = G_0 + I - F(G_0) = 2I - F

Our second prediction can be adapted from that:

G_2 = G_1 + I - F(G_1)

and so on:

G_n+1 = G_n + I - F(G_n)

Proving that G_n converges to the inverse F^-1 is hard, but what we can easily prove is that if G has converged, it will stay converged.

Assume G_n = F^-1, then we can substitute into:

G_n+1 = G_n + I - F(G_n)

and then get:

G_n+1 = F^-1 + I - F(F^-1)
G_n+1 = F^-1 + I - I
G_n+1 = F^-1
Q.E.D.
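
Since the question asks for something that does not depend on OpenCV, here is a sketch of the same update G_n+1 = G_n + I - F(G_n) written with NumPy and SciPy only. It is an illustration under assumptions, not the code used above: x is taken as the column coordinate, scipy.ndimage.map_coordinates with order=1 stands in for remap(), and the name invert_map_fixed_point is made up. The update is a fixed-point iteration, so it can be expected to converge where the Jacobian of F stays close to the identity (the displacement field has a small gradient) and to fail for strongly stretched or folded maps.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def invert_map_fixed_point(xmap, ymap, n_iter=20):
    """Iterate P <- P + I - F(P), where F = (xmap, ymap) is sampled bilinearly."""
    xmap = np.asarray(xmap, dtype=float)
    ymap = np.asarray(ymap, dtype=float)
    yy, xx = np.indices(xmap.shape).astype(float)  # identity map I
    px, py = xx.copy(), yy.copy()                  # G_0 = I
    for _ in range(n_iter):
        # F(P): evaluate the forward map at the current estimate (row, column order)
        fx = map_coordinates(xmap, [py, px], order=1, mode="nearest")
        fy = map_coordinates(ymap, [py, px], order=1, mode="nearest")
        px += xx - fx
        py += yy - fy
    return px, py
```

Near the image border the estimate may leave the grid, where mode="nearest" clamps the lookup, so the result there should be treated with care.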

Testing script

import cv2 as cv
from scipy import ndimage as ndi
import numpy as np
from matplotlib import pyplot as plt

# Simulate deformation field
N = 500
sh = (N, N)
t = np.random.normal(size=sh)
dx = ndi.gaussian_filter(t, 40, order=(0,1))
dy = ndi.gaussian_filter(t, 40, order=(1,0))
dx *= 10/dx.max()
dy *= 10/dy.max()

# Test image
img = np.zeros(sh)
img[::10, :] = 1
img[:, ::10] = 1
img = ndi.gaussian_filter(img, 0.5)

# Apply forward mapping
yy, xx = np.indices(sh)
xmap = (xx-dx).astype(np.float32)
ymap = (yy-dy).astype(np.float32)
warped = cv.remap(img, xmap, ymap ,cv.INTER_LINEAR)
plt.imshow(warped, cmap='gray')

[Image: output 1, the warped test grid]

def invert_map(F: np.ndarray):
    I = np.zeros_like(F)
    I[:,:,1], I[:,:,0] = np.indices(sh)
    P = np.copy(I)
    for i in range(10):
        P += I - cv.remap(F, P, None, interpolation=cv.INTER_LINEAR)
    return P

# F: The function to invert
F = np.zeros((sh[0], sh[1], 2), dtype=np.float32)
F[:,:,0], F[:,:,1] = (xmap, ymap)

# Test the prediction
unwarped = cv.remap(warped, invert_map(F), None, cv.INTER_LINEAR)
plt.imshow(unwarped, cmap='gray')


指尖上得阳光 2025-02-08 23:07:06

Well I just had to solve this remap inversion problem myself and I'll outline my solution.

Given X, Y for the remap() function that does the following:

B[i, j] = A(X[i, j], Y[i, j])   

I computed Xinv, Yinv that can be used by the remap() function to invert the process:

A[x, y] = B(Xinv[x,y],Yinv[x,y])

First I build a KD-Tree for the 2D point set {(X[i,j], Y[i,j])} so I can efficiently find the N nearest neighbors to a given point (x, y). I use Euclidean distance as my distance metric. I found a great C++ header library for KD-Trees on GitHub.

Then I loop through all the (x, y) values in A's grid and find the N = 5 nearest neighbors {(X[i_k,j_k], Y[i_k,j_k]) | k = 0 .. N-1} in my point set.

  • If distance d_k == 0 for some k then Xinv[x,y] = i_k and Yinv[x,y] = j_k, otherwise...

  • Use Inverse Distance Weighting (IDW) to compute an interpolated value:

    • let weight w_k = 1 / pow(d_k, p) (I use p = 2)
    • Xinv[x,y] = (sum_k w_k * i_k)/(sum_k w_k)
    • Yinv[x,y] = (sum_k w_k * j_k)/(sum_k w_k)

Note that if B is a W x H image then X and Y are W x H arrays of floats. If A is a w x h image then Xinv and Yinv are w x h arrays of floats. It is important to be consistent with image and map sizing.
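
The steps above can be sketched in Python with scipy.spatial.cKDTree in place of the C++ KD-tree library. This is an illustration under assumptions rather than the code used for the answer: x is taken as the column coordinate, the returned Xinv holds column indices and Yinv row indices so that the pair can be passed to a remap-style call, and the name invert_map_idw is made up.

```python
import numpy as np
from scipy.spatial import cKDTree


def invert_map_idw(X, Y, out_shape, k=5, p=2):
    """Approximate inverse of the remap grids (X, Y) by k-nearest-neighbour IDW."""
    # Each sample (X[i, j], Y[i, j]) is a 2D point that knows its grid index (i, j)
    tree = cKDTree(np.column_stack([X.ravel(), Y.ravel()]))
    src_i, src_j = (a.ravel().astype(float) for a in np.indices(X.shape))
    # Query the tree at every integer pixel (x = column, y = row) of A's grid
    qy, qx = np.indices(out_shape)
    d, idx = tree.query(np.column_stack([qx.ravel(), qy.ravel()]), k=k)
    # Inverse distance weights; the tiny floor only guards the division
    w = 1.0 / np.maximum(d, 1e-12) ** p
    w /= w.sum(axis=1, keepdims=True)
    inv_i = (w * src_i[idx]).sum(axis=1)
    inv_j = (w * src_j[idx]).sum(axis=1)
    # Exact hits (d == 0) take the index of the matching sample directly
    exact = d[:, 0] == 0
    inv_i[exact] = src_i[idx[exact, 0]]
    inv_j[exact] = src_j[idx[exact, 0]]
    return inv_j.reshape(out_shape), inv_i.reshape(out_shape)
```

Querying all pixels in one tree.query call avoids the per-pixel Python loop. Note that IDW is a smoothing estimate, so it is generally not exact between samples even for an affine map.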

Works like a charm! In my first version I tried brute-forcing the search, and I never even waited for it to finish. Once I switched to a KD-Tree I started to get reasonable run times. If I ever get time I would like to add this to OpenCV.

The second image below is the result of using remap() to remove the lens distortion from the first image. The third image is the result of inverting the process.

[Images: original photograph, undistorted photograph, result of inverting the process]

慵挽 2025-02-08 23:07:06

This is an important problem, and I am surprised that it is not better addressed in any standard library (at least to my knowledge).

I wasn't happy with the accepted solution as it didn't use the implicit smoothness of the transformation. I might be missing important cases, but I cannot imagine mappings that are both invertible in any useful sense and non-smooth at the pixel scale.

Smoothness means that there is no need to compute nearest neighbors: the nearest points are those that are already near on the original grid.

My solution uses the fact that, in the original mapping, a square [(i,j), (i+1, j), (i+1, j+1), (i, j+1)] maps to a quadrilateral [(X[i,j], Y[i,j]), (X[i+1,j], Y[i+1,j]), ...] that has no other points inside. Then the inverse mapping only requires interpolation within the quadrilateral. For this I use an inverse bilinear interpolation, which will give exact results at the vertices and for any affine transform.

The implementation has no dependency other than numpy. The logic is to run through all quadrilaterals and progressively build the reverse mapping. I copy the code here; hopefully there are enough comments to make the idea clear.

A few comments on the less obvious stuff:

  • The inverse bilinear function would normally return coordinates only in the range [0,1]. I removed the clipping operation, so that out-of-range values mean that the coordinate is outside of the quadrilateral (that's a contorted way of solving the point-in-polygon problem!). To avoid missing points on the edges, I actually allow for points slightly out of the [0,1] range, which means that an index may be picked up by two neighboring quadrilaterals. In these rare cases I just let the result be the average of the two results, trusting that the out-of-range points are "extrapolating" in a reasonable way.
  • In general all quadrilaterals have a different shape, and their overlap with the regular grid can go from nothing at all to very many points. The routine solves all quadrilaterals at once (to exploit the vectorised nature of bilinear_inverse), but at each iteration selects only the quadrilaterals for which the coordinates (offset to their bounding box) are valid.

import numpy as np

def bilinear_inverse(p, vertices, numiter=4):
    """
    Compute the inverse of the bilinear map from the unit square
    [(0,0), (1,0), (1,1), (0,1)]
    to the quadrilateral vertices = [p0, p1, p2, p3]

    Parameters:
    ----------
    p: array of shape (2, ...)
        Points on which the inverse transforms are applied.
    vertices: array of shape (4, 2, ...)
        Coordinates of the vertices mapped to the unit square corners
    numiter:
        Number of Newton iterations

    Returns:
    --------
    s: array of shape (2, ...)
        Mapped points.

    This is a (more general) python implementation of the matlab implementation 
    suggested in https://stackoverflow.com/a/18332009/1560876
    """

    p = np.asarray(p)
    v = np.asarray(vertices)
    sh = p.shape[1:]
    if v.ndim == 2:
        v = np.expand_dims(v, axis=tuple(range(2, 2 + len(sh))))

    # Start in the center
    s = .5 * np.ones((2,) + sh)
    s0, s1 = s
    for k in range(numiter):
        # Residual
        r = v[0] * (1 - s0) * (1 - s1) + v[1] * s0 * (1 - s1) + v[2] * s0 * s1 + v[3] * (1 - s0) * s1 - p

        # Jacobian
        J11 = -v[0, 0] * (1 - s1) + v[1, 0] * (1 - s1) + v[2, 0] * s1 - v[3, 0] * s1
        J21 = -v[0, 1] * (1 - s1) + v[1, 1] * (1 - s1) + v[2, 1] * s1 - v[3, 1] * s1
        J12 = -v[0, 0] * (1 - s0) - v[1, 0] * s0 + v[2, 0] * s0 + v[3, 0] * (1 - s0)
        J22 = -v[0, 1] * (1 - s0) - v[1, 1] * s0 + v[2, 1] * s0 + v[3, 1] * (1 - s0)

        inv_detJ = 1. / (J11 * J22 - J12 * J21)

        s0 -= inv_detJ * (J22 * r[0] - J12 * r[1])
        s1 -= inv_detJ * (-J21 * r[0] + J11 * r[1])

    return s


def invert_map(xmap, ymap, diagnostics=False):
    """
    Generate the inverse of deformation map defined by (xmap, ymap) using inverse bilinear interpolation.
    """

    # Generate quadrilaterals from mapped grid points.
    quads = np.array([[ymap[:-1, :-1], xmap[:-1, :-1]],
                      [ymap[1:, :-1], xmap[1:, :-1]],
                      [ymap[1:, 1:], xmap[1:, 1:]],
                      [ymap[:-1, 1:], xmap[:-1, 1:]]])

    # Range of indices possibly within each quadrilateral
    x0 = np.floor(quads[:, 1, ...].min(axis=0)).astype(int)
    x1 = np.ceil(quads[:, 1, ...].max(axis=0)).astype(int)
    y0 = np.floor(quads[:, 0, ...].min(axis=0)).astype(int)
    y1 = np.ceil(quads[:, 0, ...].max(axis=0)).astype(int)

    # Quad indices
    i0, j0 = np.indices(x0.shape)

    # Offset of destination map
    x0_offset = x0.min()
    y0_offset = y0.min()

    # Index range in x and y (per quad)
    xN = x1 - x0 + 1
    yN = y1 - y0 + 1

    # Shape of destination array
    sh_dest = (1 + y1.max() - y0_offset, 1 + x1.max() - x0_offset)  # (rows, columns)

    # Coordinates of destination array
    yy_dest, xx_dest = np.indices(sh_dest)

    xmap1 = np.zeros(sh_dest)
    ymap1 = np.zeros(sh_dest)
    TN = np.zeros(sh_dest, dtype=int)

    # Smallish number to avoid missing point lying on edges
    epsilon = .01

    # Loop through indices possibly within quads
    for ix in range(xN.max()):
        for iy in range(yN.max()):
            # Work only with quads whose bounding box contain indices
            valid = (xN > ix) * (yN > iy)

            # Local points to check
            p = np.array([y0[valid] + iy, x0[valid] + ix])  # (y, x): iy offsets rows, ix offsets columns

            # Map the position of the point in the quad
            s = bilinear_inverse(p, quads[:, :, valid])

            # s out of unit square means p out of quad
            # Keep some epsilon around to avoid missing edges
            in_quad = np.all((s > -epsilon) * (s < (1 + epsilon)), axis=0)

            # Add found indices
            ii = p[0, in_quad] - y0_offset
            jj = p[1, in_quad] - x0_offset

            ymap1[ii, jj] += i0[valid][in_quad] + s[0][in_quad]
            xmap1[ii, jj] += j0[valid][in_quad] + s[1][in_quad]

            # Increment count
            TN[ii, jj] += 1

    ymap1 /= TN + (TN == 0)
    xmap1 /= TN + (TN == 0)

    if diagnostics:
        diag = {'x_offset': x0_offset,
                'y_offset': y0_offset,
                'mask': TN > 0}
        return xmap1, ymap1, diag
    else:
        return xmap1, ymap1

Here's a test example

import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
from scipy import ndimage as ndi

# Simulate deformation field
N = 500
sh = (N, N)
t = np.random.normal(size=sh)
dx = ndi.gaussian_filter(t, 40, order=(0,1))
dy = ndi.gaussian_filter(t, 40, order=(1,0))
dx *= 30/dx.max()
dy *= 30/dy.max()

# Test image
img = np.zeros(sh)
img[::10, :] = 1
img[:, ::10] = 1
img = ndi.gaussian_filter(img, 0.5)

# Apply forward mapping
yy, xx = np.indices(sh)
xmap = (xx-dx).astype(np.float32)
ymap = (yy-dy).astype(np.float32)
warped = cv.remap(img, xmap, ymap ,cv.INTER_LINEAR)
plt.imshow(warped, cmap='gray')

[Image: warped image]

# Now invert the mapping
xmap1, ymap1 = invert_map(xmap, ymap)

unwarped = cv.remap(warped, xmap1.astype(np.float32), ymap1.astype(np.float32) ,cv.INTER_LINEAR)

plt.imshow(unwarped, cmap='gray')

[Image: unwarped image]

梦幻之岛 2025-02-08 23:07:06

You can invert the map at the known points and interpolate the result onto a new grid.
It works fine as long as the distortion is not too large.

Here is a very simple implementation in Python using scipy.interpolate.griddata:

import cv2
import numpy as np
from scipy.interpolate import griddata

# K, D, new_K and image_size come from your own camera calibration
map_x, map_y = cv2.initUndistortRectifyMap(K, D, None, new_K, image_size, cv2.CV_32FC1)

points = np.stack([map_x.flatten(), map_y.flatten()], axis=1)
grid = np.mgrid[:map_x.shape[0], :map_y.shape[1]]
values = grid.reshape(2, -1).T[..., ::-1]  # (x, y) grid index of every map sample

grid_y, grid_x = grid
map_back = griddata(points, values, (grid_x, grid_y), method='cubic').astype(map_x.dtype)

If you use CV_32FC2 for maps, you can simplify points construction:

map_undistort, _ = cv2.initUndistortRectifyMap(K, D, None, new_K, image_size, cv2.CV_32FC2)
points = map_undistort.reshape(-1, 2)
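
For trying the idea without a camera calibration, here is a self-contained sketch of the same scattered-data approach. It is an illustration under assumptions: the name invert_map_griddata is made up, each index component is interpolated in its own griddata call, and method="linear" is used by default instead of the cubic method above.

```python
import numpy as np
from scipy.interpolate import griddata


def invert_map_griddata(map_x, map_y, method="linear"):
    """Resample the scattered (map_x, map_y) -> (column, row) pairs onto the pixel grid."""
    rows, cols = np.indices(map_x.shape)
    # Every map sample is a scattered point whose value is its own grid index
    points = np.column_stack([map_x.ravel(), map_y.ravel()])
    # Pixels outside the convex hull of the points come back as NaN
    inv_x = griddata(points, cols.ravel().astype(float), (cols, rows), method=method)
    inv_y = griddata(points, rows.ravel().astype(float), (cols, rows), method=method)
    return inv_x, inv_y
```

The NaN pixels mark places where the forward map never lands, so they need a fill value or a mask before the result is cast to float32 and handed to remap().
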
秋日私语 2025-02-08 23:07:06

If your map is derived from a homography H, you can invert H and directly create the inverse maps with cv::initUndistortRectifyMap().

e.g. in Python:

import cv2
import numpy as np
map_size = () # fill in your map size
H_inv = np.linalg.inv(H)
map1, map2 = cv2.initUndistortRectifyMap(cameraMatrix=np.eye(3), distCoeffs=np.zeros(5), R=H_inv, newCameraMatrix=np.eye(3), size=map_size, m1type=cv2.CV_32FC1)

The OpenCV documentation states about initUndistortRectifyMap():

The function actually builds the maps for the inverse mapping
algorithm that is used by remap(). That is, for each pixel (u, v) in
the destination image, the function computes the corresponding
coordinates in the source image.

If you are given only the maps, you have to do it yourself.
However, interpolating the new maps' coordinates is not trivial, because the support region for one pixel could be very large.

Here is a simple Python solution which inverts the maps by doing point-to-point mapping. This will probably leave some coordinates unassigned, while others will be updated several times. So there may be holes in the map.

Here is a small Python program demonstrating both approaches:

import cv2
import numpy as np


def invert_maps(map_x, map_y):
    assert(map_x.shape == map_y.shape)
    rows = map_x.shape[0]
    cols = map_x.shape[1]
    m_x = np.ones(map_x.shape, dtype=map_x.dtype) * -1
    m_y = np.ones(map_y.shape, dtype=map_y.dtype) * -1
    for i in range(rows):
        for j in range(cols):
            i_ = int(round(map_y[i, j]))  # int() guards against round() returning a float
            j_ = int(round(map_x[i, j]))
            if 0 <= i_ < rows and 0 <= j_ < cols:
                m_x[i_, j_] = j
                m_y[i_, j_] = i
    return m_x, m_y


def main():
    img = cv2.imread("pigeon.png", cv2.IMREAD_GRAYSCALE)

    # a simple rotation by 45 degrees
    H = np.array([np.sin(np.pi/4), -np.cos(np.pi/4), 0, np.cos(np.pi/4), np.sin(np.pi/4), 0, 0, 0, 1]).reshape((3,3))
    H_inv = np.linalg.inv(H)
    map_size = (img.shape[1], img.shape[0])

    map1, map2 = cv2.initUndistortRectifyMap(cameraMatrix=np.eye(3), distCoeffs=np.zeros(5), R=H, newCameraMatrix=np.eye(3), size=map_size, m1type=cv2.CV_32FC1)
    map1_inv, map2_inv = cv2.initUndistortRectifyMap(cameraMatrix=np.eye(3), distCoeffs=np.zeros(5), R=H_inv, newCameraMatrix=np.eye(3), size=map_size, m1type=cv2.CV_32FC1)
    map1_simple_inv, map2_simple_inv = invert_maps(map1, map2)

    img1 = cv2.remap(src=img, map1=map1, map2=map2, interpolation=cv2.INTER_LINEAR)
    img2 = cv2.remap(src=img1, map1=map1_inv, map2=map2_inv, interpolation=cv2.INTER_LINEAR)
    img3 = cv2.remap(src=img1, map1=map1_simple_inv, map2=map2_simple_inv,
                               interpolation=cv2.INTER_LINEAR)

    cv2.imshow("Original image", img)
    cv2.imshow("Mapped image", img1)
    cv2.imshow("Mapping forth and back with H_inv", img2)
    cv2.imshow("Mapping forth and back with invert_maps()", img3)
    cv2.waitKey(0)


if __name__ == '__main__':
    main()
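
The point-to-point inversion can also be written without the Python double loop, and the holes mentioned above can be reported explicitly. This is a sketch under assumptions, not a replacement for the program above: the name invert_maps_scatter is made up, the maps are assumed to contain only finite values, and when several source pixels round to the same target NumPy keeps one of them.

```python
import numpy as np


def invert_maps_scatter(map_x, map_y):
    """Vectorised point-to-point inversion; returns the inverse maps and a hole mask."""
    rows, cols = map_x.shape
    inv_x = np.full((rows, cols), -1, dtype=np.float32)
    inv_y = np.full((rows, cols), -1, dtype=np.float32)
    src_i, src_j = np.indices((rows, cols))
    # Nearest integer target of every source pixel
    dst_i = np.rint(map_y).astype(int)
    dst_j = np.rint(map_x).astype(int)
    ok = (dst_i >= 0) & (dst_i < rows) & (dst_j >= 0) & (dst_j < cols)
    # Scatter the source indices to their targets
    inv_x[dst_i[ok], dst_j[ok]] = src_j[ok]
    inv_y[dst_i[ok], dst_j[ok]] = src_i[ok]
    holes = inv_x < 0  # targets that no source pixel rounded to
    return inv_x, inv_y, holes
```

The hole mask makes it possible to fill the unassigned coordinates afterwards, for example by interpolating from neighbouring valid entries.
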
一抹淡然 2025-02-08 23:07:06

Here's an implementation of @wcochran's answer. I was trying to reverse a lens correction produced by lensfunpy.

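import cv2
import lensfunpy
import numpy as np
from scipy.spatial import KDTree  # assumption: the original post does not show its KDTree import

# lens, cam, width, height, focal_length, aperture and distance come from your own setup
mod = lensfunpy.Modifier(lens, cam.crop_factor, width, height)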
mod.initialize(focal_length, aperture, distance)

undist_coords = mod.apply_geometry_distortion()

## the lens correction part
# im_undistorted = cv2.remap(im, undist_coords, None, cv2.INTER_CUBIC)

# im_undistorted = cv2.remap(im, undist_coords, None, cv2.INTER_LANCZOS4)
# cv2.imwrite(undistorted_image_path, im_undistorted)
undist_coords_f = undist_coords.reshape((-1, 2))
tree = KDTree(undist_coords_f)
def calc_val(point_pos):
    nearest_dist, nearest_ind = tree.query([point_pos], k=5)
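    if nearest_dist[0][0] == 0:
        # exact hit: return the grid index of that sample, like the weighted branch below
        return (nearest_ind[0][0] // 1080, nearest_ind[0][0] % 1080)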
    # starts inverse distance weighting
    w = np.array([1.0 / pow(d, 2) for d in nearest_dist])
    sw = np.sum(w)
    # embed()
    x_arr = np.floor(nearest_ind[0] / 1080)
    y_arr = (nearest_ind[0] % 1080)
    xx = np.sum(w * x_arr) / sw
    yy = np.sum(w * y_arr) / sw
    return (xx, yy)

un_correction_x = np.zeros((720, 1080))
un_correction_y = np.zeros((720, 1080))

## reverse the lens correction
for i in range(720):
    print("row %d operating" % i)
    for j in range(1080):
        un_correction_x[i][j], un_correction_y[i][j] = calc_val((i, j))
        # print((i, j), calc_val((j, i)))

dstMap1, dstMap2 = cv2.convertMaps(un_correction_x.astype(np.float32), un_correction_y.astype(np.float32), cv2.CV_32FC2)
im_un_undistorted = cv2.remap(im_undistorted, dstMap1, dstMap2, cv2.INTER_LANCZOS4)
裸钻 2025-02-08 23:07:06

A KNeighborsRegressor has all the necessary components to invert the grid mapping!

Here you go:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def get_inverse_maps(map1, map2):
    regressor = KNeighborsRegressor(3)
    X = np.concatenate((map2[..., None], map1[..., None]), axis=-1).reshape(-1, 2)
    y = np.indices(map1.shape).transpose((1, 2, 0)).reshape(-1, 2)
    regressor.fit(X, y)
    map_inv = regressor.predict(y).reshape(map1.shape + (2,)).astype(np.float32)
    map_inv2, map_inv1 = map_inv[..., 0], map_inv[..., 1]
    return map_inv1, map_inv2
等待圉鍢 2025-02-08 23:07:06

Barycentric Resampling Solution

I used a resampling approach with barycentric interpolation. I optimized it with numba. The result is fast and resilient to distortion, rotation, scaling, symmetry and zoom.

import math

import numba
import numpy as np
import numpy.typing as npt


@numba.njit()
def vertex_index_buffer(h: int, w: int) -> npt.NDArray:
    """
    Each quad formed by 4 points can be split up in 2 triangles.
    returns a 2D array of height=(h-1)*(w-1)*2 and width 3. Each row corresponds to a triangle
    """
    N = (h - 1) * (w - 1) * 2
    n = 0
    triangle_vib = np.empty((N, 3), dtype=np.int32)

    # for each quadrilateral
    for y in range(h - 1):
        for x in range(w - 1):
            # indexes of the 4 points
            ind0 = y * w + x
            ind1 = ind0 + 1
            ind2 = ind0 + w
            ind3 = ind2 + 1

            # fill 2 triangles
            triangle_vib[n, :] = ind0, ind1, ind2
            triangle_vib[n + 1, :] = ind2, ind1, ind3
            n += 2
    return triangle_vib


@numba.jit(nopython=True)
def invert_map(xmap: npt.NDArray, ymap: npt.NDArray) -> tuple[npt.NDArray, npt.NDArray]:
    h, w = xmap.shape
    xmap_inv = np.zeros_like(xmap) - 1
    ymap_inv = np.zeros_like(ymap) - 1

    triangle_vib = vertex_index_buffer(h, w)

    # for each triangle
    for k0, k1, k2 in triangle_vib:
        # get xy coordinates of the triangle's vertices
        x0 = xmap.ravel()[k0]
        x1 = xmap.ravel()[k1]
        x2 = xmap.ravel()[k2]

        y0 = ymap.ravel()[k0]
        y1 = ymap.ravel()[k1]
        y2 = ymap.ravel()[k2]

        # barycentric coordinates
        dy21 = y1 - y2
        dx20 = x0 - x2
        dx12 = x2 - x1
        dy20 = y0 - y2

        norm = dy21 * dx20 + dx12 * dy20

        i0 = k0 // w
        i1 = k1 // w
        i2 = k2 // w

        j0 = k0 % w
        j1 = k1 % w
        j2 = k2 % w

        # search area (rectangle surrounding current triangle)
        xmin = int(math.floor(min(x0, x1, x2)))
        ymin = int(math.floor(min(y0, y1, y2)))
        xmax = int(math.ceil(max(x0, x1, x2)))
        ymax = int(math.ceil(max(y0, y1, y2)))

        xmin = min(max(0, xmin), w - 1)
        ymin = min(max(0, ymin), h - 1)
        xmax = min(max(0, xmax), w - 1)
        ymax = min(max(0, ymax), h - 1)

        if abs(norm) <= 0.01:
            xmap_inv[ymin:ymax, xmin:xmax] = j0
            ymap_inv[ymin:ymax, xmin:xmax] = i0
            continue

        for px in range(xmin, xmax):
            pwx0 = dy21 * (px - x2)
            pwx1 = -dy20 * (px - x2)
            for py in range(ymin, ymax):
                # compute normalized weights of barycentric coordinates. Sum of weights must be 1
                w0 = (pwx0 + dx12 * (py - y2)) / norm
                w1 = (pwx1 + dx20 * (py - y2)) / norm
                w2 = 1 - w0 - w1

                # barycentric interpolation
                xmap_inv[py, px] = (j0 * w0 + j1 * w1 + j2 * w2)
                ymap_inv[py, px] = (i0 * w0 + i1 * w1 + i2 * w2)

    return xmap_inv, ymap_inv
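
The barycentric weights computed in the inner loop can be checked in isolation. The following is a minimal pure-Python sketch (the helper name barycentric_weights is hypothetical and not part of the code above) that applies the same formulas as invert_map to a single point and a single triangle:

```python
def barycentric_weights(px, py, x0, y0, x1, y1, x2, y2):
    """Weights (w0, w1, w2) of point (px, py) relative to a triangle.

    Same formulas as the inner loop of invert_map; the helper itself is
    only an illustration and is not part of the original answer.
    """
    dy21 = y1 - y2
    dx20 = x0 - x2
    dx12 = x2 - x1
    dy20 = y0 - y2
    # signed quantity with magnitude equal to twice the triangle area
    norm = dy21 * dx20 + dx12 * dy20
    w0 = (dy21 * (px - x2) + dx12 * (py - y2)) / norm
    w1 = (-dy20 * (px - x2) + dx20 * (py - y2)) / norm
    w2 = 1.0 - w0 - w1
    return w0, w1, w2


# Triangle (0, 0), (4, 0), (0, 4); the point (1, 1) lies inside it.
weights = barycentric_weights(1, 1, 0, 0, 4, 0, 0, 4)
print(weights)  # (0.5, 0.25, 0.25)
```

A point lies inside the triangle only when all three weights are between 0 and 1. invert_map as written does not test this and fills the whole bounding rectangle of each triangle, so adding such a test would be one possible refinement.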

I compared it with the iterative solution proposed by Hannesh. The iterative solution is faster than mine, but it fails under rotation and symmetry.

Benchmark

I've created a repo, invert_map, for benchmarking the different algorithms I tried. Feel free to add yours.

仄言 2025-02-08 23:07:06

There is no standard way to do it with OpenCV.

If you are looking for a complete ready-to-use solution, I am not sure I can help, but I can at least describe a method that I used some years ago for this task.

First of all, you should create remapping maps with the same dimensions as your source image. I created maps with larger dimensions to simplify interpolation, and cropped them to the proper size in the final step. Then you should fill them with the values that exist in the previous remapping maps (not so difficult: just iterate over them, and if a map's coordinates x and y lie within the limits of your image, take their row and column as the new y and x, and place them at the old x and y column and row of the new map). It is a rather simple solution, but it gives rather good results. For a perfect one, you should interpolate the old x and y to integer values using your interpolation method and the neighbouring pixels.

After this you should either remap the pixel colors manually, or completely fill your remapping map with pixel coordinates and use the version from OpenCV.

You will then meet a rather challenging task: you should interpolate pixels in the empty areas. In other words, you should take the distances to the closest non-zero pixel coordinates and mix fractions of the colors (if you remap colors) or coordinates (if you proceed with the full map computation) according to these distances. Actually, this is not so difficult for linear interpolation, and you can even look at the remap() implementation on the OpenCV GitHub page. For NN interpolation it will be much simpler: just take the color/coordinate of the nearest neighbour.

The final task is extrapolating the areas outside the borders of the remapped pixel area. The algorithm from OpenCV can also be used as a reference here.

奈何桥上唱咆哮 2025-02-08 23:07:06

OP here. I think I've found an answer. I haven't implemented it yet, and if someone comes up with a less fiddly solution (or finds something wrong with this one), I'll choose their answer instead.

Problem statement

Let A be the source image, B be the destination image, and M be the mapping from A's coords to B's coords, i.e.:

B[k, l, :] == A(M[k, l, 0], M[k, l, 1], :) 
for all k, l in B's coords.

...where square brackets indicate array lookup with integer indices, and round brackets indicate bilinear interpolation lookup with floating-point indices. We restate the above using the more economical notation:

B = A(M)

We wish to find an inverse mapping N that maps B back to A as well as possible:

Find N s.t. A \approx B(N)

The problem can be stated without reference to A or B:

Find N = argmin_N || M(N) - I_n ||

...where ||*|| indicates the Frobenius norm, and I_n is the identity map with the same dimensions as N, i.e. a map where:

I_n[i, j, :] == [i, j]
for all i, j

Naive solution

If M's values are all integers, and M is an isomorphism, then you can construct N directly as:

N[M[k, l, 0], M[k, l, 1], :] = [k, l]
for all k, l

Or in our simplified notation:

N[M] = I_m

...where I_m is the identity map with the same dimensions as M.
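
For the special case described here (an integer-valued M that is a bijection on the pixel grid), the direct construction can be sketched in NumPy. The function name naive_inverse and the use of -1 to mark holes are assumptions made for illustration, and the sketch assumes N has the same shape as M:

```python
import numpy as np


def naive_inverse(M):
    """Directly invert an integer-valued map M of shape (H, W, 2).

    Illustration only: valid when every M[k, l] is an in-range integer
    coordinate. Entries of N that no M[k, l] points at stay at -1 ("holes").
    """
    h, w = M.shape[:2]
    # identity map I_m: I_m[k, l] == [k, l]
    I_m = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    N = np.full_like(I_m, -1)
    N[M[..., 0], M[..., 1]] = I_m  # N[M] = I_m
    return N


# Example map: a cyclic shift by one column on a 3x4 grid (a bijection).
h, w = 3, 4
I = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
M = I.copy()
M[..., 1] = (I[..., 1] + 1) % w
N = naive_inverse(M)

# Composing the two maps gives back the identity: M[N] == I.
print(np.array_equal(M[N[..., 0], N[..., 1]], I))
```

With a map that is not a bijection, some entries of N would remain at -1, which is the first of the two problems discussed next.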

There are two problems:

  1. M is generally not an isomorphism, so the above will leave "holes" in N at N[i, j, :] for any [i, j] not among the values in M.
  2. M's values are floating-point coordinates [i, j], not integer coordinates. We cannot simply assign a value to the bilinearly-interpolated quantity N(i, j, :), for float-valued i, j. To achieve the equivalent effect, we must instead set the values of [i, j]'s four surrounding corners N[floor(i), floor(j), :], N[floor(i), ceil(j), :], N[ceil(i), floor(j), :], N[ceil(i), ceil(j), :] such that the interpolated value N(i, j, :) equals the desired value [k, l], for all pixel mappings [i, j] --> [k, l] in M.

Solution

Construct empty N as a 3D tensor of floats:

N = zeros(size=(A.shape[0], A.shape[1], 2))

For each coordinate [i, j] in A's coordinate space, do:

  1. Find the 2x2 grid of A-coordinates in M that [i, j] lies within.
  2. Compute the homography matrix H that maps those A-coordinates to their corresponding B-coordinates (given by the 2x2 grid's pixel indices).
  3. Set N[i, j, :] = matmul(H, [i, j])

The potentially expensive step here would be the search in step 1 for the 2x2 grid of A-coordinates in M that encircles [i, j]. A brute-force search would make this whole algorithm O(n*m) where n is the number of pixels in A, and m the number of pixels in B.

To reduce this to O(n), one could instead run a scanline algorithm within each A-coordinate quadrilateral to identify all the integer-valued coordinates [i, j] it contains. This could be precomputed as a hashmap that maps integer-valued A coords [i, j] to the upper-left corner of its encircling quadrilateral's B coords [k, l].
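
The homography step (computing H from the four corners of one 2x2 cell) can be sketched with a plain linear solve. This is a minimal NumPy sketch and not an implementation by the OP, who states above that the method was not implemented. The function names and the sample quadrilateral are made up for illustration, and the sketch assumes the bottom-right entry of H can be normalized to 1:

```python
import numpy as np


def homography_from_4_points(src, dst):
    """Solve for the 3x3 matrix H with H @ [x, y, 1] ~ [u, v, 1].

    src, dst: four (x, y) and four (u, v) pairs. Assumes no three source
    points are collinear and that H[2, 2] can be fixed to 1.
    """
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H, point):
    """Map a 2D point through H using homogeneous coordinates."""
    x, y, s = H @ np.array([point[0], point[1], 1.0])
    return np.array([x / s, y / s])


# Hypothetical cell of M: the A-coordinates stored at B indices
# [2, 5], [2, 6], [3, 5], [3, 6].
src = [(10.2, 20.1), (10.4, 21.3), (11.5, 19.9), (11.8, 21.6)]
dst = [(2, 5), (2, 6), (3, 5), (3, 6)]
H = homography_from_4_points(src, dst)

# An integer A-coordinate inside the quadrilateral gets a real-valued B-coordinate.
print(apply_homography(H, (11, 21)))
```

Note that applying H needs the homogeneous vector [i, j, 1] followed by a division by the third component, so the matmul in the last step of the list above is shorthand for that.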

指尖上得阳光 2025-02-08 23:07:06

One way to do it is to take the original map, iterate through its entries, and take the floors and ceils of the x and y values. This gives the four nearest integer points around (x,y): (xf,yf), (xc,yf), (xf,yc), and (xc,yc), in the coordinates of the original source image. You can then fill in a structure, indexed by each of these points, that contains the pixel value and a weight, and use your preferred irregular grid interpolation with those data.

This is easy to implement with inverse distance interpolation, since the structure can be an image array accumulation and the weights are scalars. F is the original source, G is the warped image, and F' is the restored image. The map is M.

Init F' to 0. Create a 0-initialized weight array W of floats the same size as F'.

Iterate through M. For each entry (x,y) at index (i,j) in M, find the 4 integer pairs and their distances from (x,y). Take the corresponding pixel value from G, weight it by its reciprocal distance, and accumulate it into F' like

F'(xf|c,yf|c)+=G(i,j)/sqrt((x-xf|c)^2+(y-yf|c)^2)

Then accumulate that weight into

W(xf|c,yf|c)+=1./sqrt((x-xf|c)^2+(y-yf|c)^2).

After that is finished, normalize F' by iterating through it and dividing each pixel by its corresponding entry in W, if it's non-zero.

At this point, the image is usually nearly complete, but with high downsampling ratios, some pixels in F' may not get filled in. So then you do a couple of passes back and forth through W to find zero-weight entries, and interpolate those pixels from their non-empty neighbors. This part could be done with KNN search and interpolation too, since there usually aren't many of them.
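
The accumulation and normalization passes described above can be sketched in NumPy as follows. This is an illustrative sketch, not the answerer's code: the function name splat_inverse, the eps guard against zero distances, the single-channel image, and the convention that M[i, j] = (x, y) with x as the column and y as the row are all assumptions. Filling the remaining zero-weight pixels is left out; the returned mask shows where that second pass would be needed.

```python
import numpy as np


def splat_inverse(G, M, out_shape, eps=1e-6):
    """Scatter the warped image G back onto the source grid.

    G: (H, W) warped image. M: (H, W, 2), where M[i, j] = (x, y) is the
    real-valued source coordinate that was sampled to produce G[i, j].
    eps is an added guard (an assumption) against division by zero.
    """
    F = np.zeros(out_shape, dtype=float)  # restored image F'
    W = np.zeros(out_shape, dtype=float)  # accumulated weights
    x, y = M[..., 0], M[..., 1]
    # visit the four integer corners around every (x, y)
    for cx in (np.floor(x), np.ceil(x)):
        for cy in (np.floor(y), np.ceil(y)):
            dist = np.sqrt((x - cx) ** 2 + (y - cy) ** 2)
            wgt = 1.0 / (dist + eps)  # reciprocal-distance weight
            ix, iy = cx.astype(int), cy.astype(int)
            ok = (ix >= 0) & (ix < out_shape[1]) & (iy >= 0) & (iy < out_shape[0])
            # np.add.at accumulates correctly when indices repeat
            np.add.at(F, (iy[ok], ix[ok]), wgt[ok] * G[ok])
            np.add.at(W, (iy[ok], ix[ok]), wgt[ok])
    filled = W > 0
    F[filled] /= W[filled]  # normalize where any weight was accumulated
    return F, filled


# Sanity check with an identity map: the restored image equals G.
h, w = 4, 5
G = np.arange(h * w, dtype=float).reshape(h, w)
yy, xx = np.mgrid[0:h, 0:w]
M = np.stack([xx, yy], axis=-1).astype(float)
F, filled = splat_inverse(G, M, (h, w))
print(np.allclose(F, G), filled.all())
```

Using np.add.at rather than plain fancy-index assignment matters here, because several entries of M usually land on the same integer corner and their contributions must be summed.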

It's easy to implement and scales a lot better than the KNN approach (though I think that's great for small images). The downside is that inverse distance isn't the greatest interpolation scheme, but it seems to work fairly well if the mapping isn't too clumpy and the original hasn't been downsampled a lot. Of course, if the downsample ratio is high, you're having to infer a lot of lost information, so it's inherently going to give rough results.

If you want to squeeze as much as possible out of the map inversion, you could try to solve the (potentially underdetermined) system of equations defined by the original interpolation scheme; not impossible, but challenging.

浪推晚风 2025-02-08 23:07:06

Well, to get the distorted image from the undistorted one, maybe you can use OpenCV's undistortPoints function to get the reverse map. With initUndistortRectifyMap you get the map from distort->undistort, and with undistortPoints you can get the map from undistort->distort point by point; then use remap to get the distorted image.

风流物 2025-02-08 23:07:06

Solution https://stackoverflow.com/a/68706787/4521113 is great, but I was not satisfied with the provided explanation. Here I will contribute my interpretation of that solution, the assumptions I think it makes, and the limitations that arise from those assumptions.

Introductory problem

Assume we have a function f(x) and we want to obtain a value x that produces f(x) = y. As an example, assume f(x) = x² and we want to find the value x that produces f(x) = 4. For this concrete case, we can invert the function and use x = f⁻¹(y) = sqrt(y), so f⁻¹(4) = sqrt(4) = 2, which gives us the solution.

However, a function is not always invertible, or finding the inverse could be non-trivial. In such cases, we can redefine the problem as a minimization problem. Let's define the loss function C(x) = ( y - f(x) )² where y is the value we want to obtain after evaluating f(x). Finding x for which f(x) = y, is equivalent to minimizing C(x).

There are plenty of algorithms used to find the minimum of a function. Let's consider Gradient descent to solve this problem, just because. In our case, we would iterate on x to find the solution through

x_{k+1} = x_k - alpha * dC/dx = x_k + 2 alpha * ( y - f(x) ) * df/dx

Applying this to our particular example starting from x_0 = 1,

import math

alpha = 5.0e-2
x = 1
for i in range(100):
    x = x + 2 * alpha * ( 4.0 - x**2 ) * (2*x)
    print(x)

we observe how the value of x slowly approaches 2, which we know to be the solution to our minimization problem.

Gradient descent to invert a map from pixel to pixel

Maps X[i,j] and Y[i,j] can be thought of as functions from R² to R, which combined produce a function F = ( X(i,j) , Y(i,j) ) from R² to R²; it maps pixel coordinates in the original image to pixel coordinates in the target image. Inverting the map is equivalent to finding F⁻¹ = ( X⁻¹(i',j') , Y⁻¹(i',j') ) that maps pixel coordinates in the target image to pixel coordinates in the original image. Again, this problem can be reformulated as a minimization problem by defining the cost function C(i,j) = || (i',j') - ( X(i,j) , Y(i,j) ) ||². And again, we can iterate using gradient descent to find the coordinates (i,j) in the original image that are mapped to the coordinates (i',j') in the target image:

(i,j)_{k+1} = (i,j)_k + 2 alpha [ (i',j') - ( X(i,j) , Y(i,j) ) ] * J

where J is the Jacobian of the map:

J = [ dX(i,j)/di  dX(i,j)/dj ]
    [ dY(i,j)/di  dY(i,j)/dj ]

and we assume (i,j) and (i',j') to be 2d row vectors. This starts to resemble the solution mentioned above.

Assumptions

There are multiple variants of gradient descent. Some of them use a "conjugate direction": a direction different from the gradient that also leads to the minimum. The solution proposed by Hannesh replaces the Jacobian J with the identity matrix. Hence, the assumption is that the identity matrix times a scale factor alpha' is a valid approximation of the Jacobian times 2 times the "learning rate" alpha: 2 * alpha * J is approximated by alpha' * I.

Finally, alpha' is chosen to be 1.

Introducing these changes into the iterative algorithm, we obtain:

(i,j)_{k+1} = (i,j)_k + [ (i',j') - ( X(i,j) , Y(i,j) ) ]

Now, we can build an image of target indices (i',j'), and approximate the evaluation of ( X(i,j) , Y(i,j) ) by using the remap function. That would yield the final version of the proposed algorithm.
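
The simplified update can be sketched as follows. This is a minimal sketch under stated assumptions, not the referenced answer's code: it swaps the remap call for a small hand-written bilinear sampler in plain NumPy (bilinear_sample, with border clamping), treats the first coordinate as the row index, and uses a hypothetical parameter named step in the role of alpha'.

```python
import numpy as np


def bilinear_sample(img, rows, cols):
    """Sample img (H, W) at real-valued positions, clamping to the border."""
    h, w = img.shape
    r = np.clip(rows, 0, h - 1)
    c = np.clip(cols, 0, w - 1)
    r0 = np.minimum(np.floor(r).astype(int), h - 2)
    c0 = np.minimum(np.floor(c).astype(int), w - 2)
    fr, fc = r - r0, c - c0
    top = img[r0, c0] * (1 - fc) + img[r0, c0 + 1] * fc
    bot = img[r0 + 1, c0] * (1 - fc) + img[r0 + 1, c0 + 1] * fc
    return top * (1 - fr) + bot * fr


def invert_map_iterative(X, Y, iterations=50, step=1.0):
    """Iterate (i,j) <- (i,j) + step * [ (i',j') - ( X(i,j), Y(i,j) ) ]."""
    h, w = X.shape
    ti, tj = np.mgrid[0:h, 0:w].astype(float)  # target coordinates (i', j')
    i, j = ti.copy(), tj.copy()                # initial guess: identity
    for _ in range(iterations):
        i_new = i + step * (ti - bilinear_sample(X, i, j))
        j_new = j + step * (tj - bilinear_sample(Y, i, j))
        i, j = i_new, j_new
    return i, j


# Example: a smooth map close to the identity.
h, w = 20, 20
ii, jj = np.mgrid[0:h, 0:w].astype(float)
X = ii + 0.3 * np.sin(jj / 4.0)
Y = jj + 0.3 * np.cos(ii / 4.0)
inv_i, inv_j = invert_map_iterative(X, Y)

# Residual of F(F^-1(i', j')) - (i', j') away from the image border.
err_i = np.abs(bilinear_sample(X, inv_i, inv_j) - ii)[2:-2, 2:-2].max()
err_j = np.abs(bilinear_sample(Y, inv_i, inv_j) - jj)[2:-2, 2:-2].max()
print(err_i, err_j)
```

The residual is only measured away from the border, because targets whose preimage falls outside the sampled grid have no solution and the iteration does not settle there. This sketch converges quickly because the example map stays close to the identity, which is the assumption behind replacing J with I.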

Practical considerations

I implemented the solution to approximate the inverse map of my camera calibration function. That camera calibration function takes coordinates in the original distorted image:
[image: original]
and transforms them into coordinates in the undistorted image. Here you can see the result of applying the iterative algorithm pixel by pixel, WITHOUT using the remap function but evaluating the current solution with the exact map:
[image: undistorted_5e-1_50_exact]

The solution is an approximation

The result provided by the algorithm is an approximation, because the remap function provides only an approximation to the evaluation of the map at the current solution. Here you can see the result of applying the algorithm using the remap function, a learning rate of alpha'=1e-2, and 1000 iterations:
[image: undistorted_approx_1e-2_1000_linear]
Note the artifacts on the borders of the image, and the lack of mapping in the bottom right corner. The interpolation method used was INTER_LINEAR, but using INTER_CUBIC does not really help either:
[image: undistorted_approx_1e-2_1000_cubic]

Choosing a learning rate that is too small or too big

The inverse map computed using the exact map was obtained using a learning rate of alpha'=5e-1, and 50 iterations. However, choosing the learning rate inappropriately can also lead to artifacts in the final result.

Here is the result of using alpha'=1e0 and 50 iterations:
[image: undistorted_1e0_50_exact]
Note the artifacts in the corners of the image. They are the consequence of the algorithm not converging because the learning rate is too big.

On the other hand, check the result of using alpha'=1e-2 and 50 iterations:
[image: undistorted_1e-2_50_exact]
Note how the "undistorted" image is not totally undistorted, and "straight" lines are still curved. That is the consequence of the algorithm not converging because the learning rate is too small.

站稳脚跟 2025-02-08 23:07:06

From what I understand, you have an original image and a transformed image, and you wish to recover the nature of the transform that has been applied without knowing it, but assuming it is something sensible, like a rotation or a fish-eye distortion.

What I would try is thresholding the image to convert it to binary, in both the index image and the plain image. Then try to identify objects. Most mappings will at least retain connectivity and Euler number, and usually the largest object in the index image will still be the largest object in the plain image.

Then take moments for your matched image / index pairs and see if you can remove translation, rotation and scaling. That gives you several reverse maps, which you can then try to stitch together. (This is hard if the transform is not simple, but the general problem of reconstituting just any transformation cannot be solved.)

无所的.畏惧 2025-02-08 23:07:06

Use KDTree and Inverse Distance Weighting (IDW)