Padding scipy affine_transform output to show non-overlapping regions of transformed images

I have source (src) images I wish to align to a destination (dst) image using an affine transformation, whilst retaining the full extent of both images during alignment (even the non-overlapping areas).

I am already able to calculate the affine-transformation rotation and offset matrix, which I feed to scipy.ndimage.interpolate.affine_transform to recover the dst-aligned src image.

The problem is that, when the images do not fully overlap, the resultant image is cropped to only the common footprint of the two images. What I need is the full extent of both images, placed on the same pixel coordinate system. This question is almost a duplicate of another question - and the excellent answer and repository there provide this functionality for OpenCV transformations. Unfortunately, I need this for scipy's implementation.

Much too late, after repeatedly hitting a brick wall trying to translate the above question's answer to scipy, I came across an issue and subsequently followed it to a further question. The latter question did give some insight into the wonderful world of scipy's affine transformations, but I have as yet been unable to crack my particular needs.

The transformations from src to dst can have translations and rotation. I can get translations-only working (an example is shown below), and I can get rotations-only working (largely hacking around the below and taking inspiration from the use of the reshape argument of scipy.ndimage.interpolation.rotate). However, I am getting thoroughly lost combining the two. I have tried to calculate what should be the correct offset (see that question's answers again), but I can't get it working in all scenarios.

Translation-only working example of a padded affine transformation, which follows largely this repository, explained in this answer:
from scipy.ndimage import rotate, affine_transform
import numpy as np
import matplotlib.pyplot as plt

nblob = 50
shape = (200, 100)
buffered_shape = (300, 200)  # buffer for rotation and translation


def affine_test(angle=0, translate=(0, 0)):
    np.random.seed(42)
    # Maximum translation allowed is half the difference between shape and buffered_shape
    # Generate a buffered_shape-sized base image with random blobs
    base = np.zeros(buffered_shape, dtype=np.float32)
    random_locs = np.random.choice(np.arange(2, buffered_shape[0] - 2), nblob * 2, replace=False)
    i = random_locs[:nblob]
    j = random_locs[nblob:]
    for k, (_i, _j) in enumerate(zip(i, j)):
        # Use different values, just to make it easier to distinguish blobs
        base[_i - 2 : _i + 2, _j - 2 : _j + 2] = k + 10
    # Impose a rotation and translation on source
    src = rotate(base, angle, reshape=False, order=1, mode="constant")
    bsc = (np.array(buffered_shape) / 2).astype(int)
    sc = (np.array(shape) / 2).astype(int)
    src = src[
        bsc[0] - sc[0] + translate[0] : bsc[0] + sc[0] + translate[0],
        bsc[1] - sc[1] + translate[1] : bsc[1] + sc[1] + translate[1],
    ]
    # Cut-out destination from the centre of the base image
    dst = base[bsc[0] - sc[0] : bsc[0] + sc[0], bsc[1] - sc[1] : bsc[1] + sc[1]]
    src_y, src_x = src.shape

    def get_matrix_offset(centre, angle, scale):
        """Follows OpenCV.getRotationMatrix2D"""
        angle = angle * np.pi / 180
        alpha = scale * np.cos(angle)
        beta = scale * np.sin(angle)
        return (
            np.array([[alpha, beta], [-beta, alpha]]),
            np.array(
                [
                    (1 - alpha) * centre[0] - beta * centre[1],
                    beta * centre[0] + (1 - alpha) * centre[1],
                ]
            ),
        )

    # Obtain the rotation matrix and offset that describes the transformation
    # between src and dst
    matrix, offset = get_matrix_offset(np.array([src_y / 2, src_x / 2]), angle, 1)
    offset = offset - translate
    # Determine the outer bounds of the new image
    lin_pts = np.array([[0, src_x, src_x, 0], [0, 0, src_y, src_y]])
    transf_lin_pts = np.dot(matrix.T, lin_pts) - offset[::-1].reshape(2, 1)
    # Find min and max bounds of the transformed image
    min_x = np.floor(np.min(transf_lin_pts[0])).astype(int)
    min_y = np.floor(np.min(transf_lin_pts[1])).astype(int)
    max_x = np.ceil(np.max(transf_lin_pts[0])).astype(int)
    max_y = np.ceil(np.max(transf_lin_pts[1])).astype(int)
    # Add translation to the transformation matrix to shift to positive values
    anchor_x, anchor_y = 0, 0
    if min_x < 0:
        anchor_x = -min_x
    if min_y < 0:
        anchor_y = -min_y
    shifted_offset = offset - np.dot(matrix, [anchor_y, anchor_x])
    # Create padded destination image
    dst_h, dst_w = dst.shape[:2]
    pad_widths = [anchor_y, max(max_y, dst_h) - dst_h, anchor_x, max(max_x, dst_w) - dst_w]
    dst_padded = np.pad(
        dst,
        ((pad_widths[0], pad_widths[1]), (pad_widths[2], pad_widths[3])),
        "constant",
        constant_values=-1,
    )
    dst_pad_h, dst_pad_w = dst_padded.shape
    # Create the aligned and padded source image
    source_aligned = affine_transform(
        src,
        matrix.T,
        offset=shifted_offset,
        output_shape=(dst_pad_h, dst_pad_w),
        order=3,
        mode="constant",
        cval=-1,
    )
    # Plot the images
    fig, axes = plt.subplots(1, 4, figsize=(10, 5), sharex=True, sharey=True)
    axes[0].imshow(src, cmap="viridis", vmin=-1, vmax=nblob)
    axes[0].set_title("Source")
    axes[1].imshow(dst, cmap="viridis", vmin=-1, vmax=nblob)
    axes[1].set_title("Dest")
    axes[2].imshow(source_aligned, cmap="viridis", vmin=-1, vmax=nblob)
    axes[2].set_title("Source aligned to Dest padded")
    axes[3].imshow(dst_padded, cmap="viridis", vmin=-1, vmax=nblob)
    axes[3].set_title("Dest padded")
    plt.show()
For example:

affine_test(0, (-20, 40))

gives:

[figure: source, destination, aligned and padded images]

With a zoomed-in look at the alignment in the padded images:

[figure: zoomed view of the padded alignment]

I require the full extent of the src and dst images aligned on the same pixel coordinates, with both rotations and translations. Any help is greatly appreciated!
3 Answers
Complexity analysis
The problem is to determine three parameters.

Let's suppose that you have a grid for the angle and the x and y displacements, each of size O(n), and that your images are of size O(n x n). Rotation, translation, and comparison of the images then all take O(n^2); since you have O(n^3) candidate transforms to try, you end up with complexity O(n^5), and probably that's why you are asking the question.

However, the displacement part can be computed slightly more efficiently, by computing the maximum correlation using Fourier transforms. The Fourier transforms can be performed with complexity O(n log n) per axis, and we have to perform them over the two spatial dimensions, so the complete correlation matrix can be computed in O(n^2 log^2 n). We then find its maximum with complexity O(n^2), so the overall time complexity of determining the best alignment is O(n^2 log^2 n). However, you still want to search for the best angle, and since we have O(n) candidate angles the overall complexity of this search will be O(n^3 log^2 n). Remember that we are using Python and may incur some significant overhead, so this complexity only gives us an idea of how difficult the problem will be, and I have handled problems like this before, so I start confident.

Preparing an example
I will start by downloading an image, applying a rotation, and centering the image, padding with zeros.
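A minimal sketch of that setup (the answer's own listing was not preserved in this copy; the file name and canvas size are placeholders):

import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import rotate

def pad_to_centre(img, shape):
    """Place img at the centre of a zero array with the given 2-D shape."""
    out = np.zeros(shape + img.shape[2:], dtype=img.dtype)
    oy = (shape[0] - img.shape[0]) // 2
    ox = (shape[1] - img.shape[1]) // 2
    out[oy : oy + img.shape[0], ox : ox + img.shape[1]] = img
    return out

base = plt.imread("sunflower.png")[..., :3].astype(float)  # hypothetical file
padded = pad_to_centre(base, (512, 512))
# Impose a known rotation to create the image we will later try to align
rotated = rotate(padded, angle=15, reshape=False, order=1, mode="constant")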
The displacement search
The first thing is to compute the correlation of the images:
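A sketch of an FFT-based circular cross-correlation (the original snippet was not preserved); the channel axis, if any, is summed so that RGB pixels act as feature vectors:

def correlation(im1, im2):
    # Cross-correlation via the convolution theorem; the result is circular
    f1 = np.fft.fft2(im1, axes=(0, 1))
    f2 = np.fft.fft2(im2, axes=(0, 1))
    cc = np.fft.ifft2(f1 * np.conj(f2), axes=(0, 1)).real
    return cc.sum(axis=2) if cc.ndim == 3 else cc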
Then, let's create an example without rotation and confirm that, with the index of the maximum correlation, we can find the displacement that fits one image to the other.
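For instance, shifting the padded image by a known amount and recovering that shift from the argmax of the correlation (values wrap because the correlation is circular):

shifted = np.roll(padded, (30, -20), axis=(0, 1))
cc = correlation(shifted, padded)
dy, dx = np.unravel_index(np.argmax(cc), cc.shape)
print(dy, dx)  # expect 30 and 512 - 20 = 492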
With rotation or interpolation this result may not be exact, but the displacement it gives will provide the closest possible alignment.
Let's put this in a function for future use:
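A sketch of such a helper (the function name is mine):

def find_displacement(im1, im2):
    """Return the (dy, dx) circular shift of im2 that best matches im1."""
    cc = correlation(im1, im2)
    dy, dx = np.unravel_index(np.argmax(cc), cc.shape)
    # Fold wrapped indices back to signed displacements
    if dy > im1.shape[0] // 2:
        dy -= im1.shape[0]
    if dx > im1.shape[1] // 2:
        dx -= im1.shape[1]
    return dy, dx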
Searching for the angle
Now we can do this in two steps: first we compute the correlation for each angle, then with the angle that gives maximum correlation we find the alignment.
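A sketch of that scan (again, the original code block was lost; the grid resolution is arbitrary):

angles = np.linspace(0, 360, 181)
scores = [
    correlation(rotate(rotated, -a, reshape=False, order=1, mode="constant"),
                padded).max()
    for a in angles
]
best_angle = angles[int(np.argmax(scores))]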
Let's see what the correlation looks like:
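For instance, plotting the peak correlation against the candidate angle:

plt.plot(angles, scores)
plt.xlabel("angle (degrees)")
plt.ylabel("peak correlation")
plt.show()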
Looking at this curve we have a clear winner, even if a sunflower has some sort of rotational symmetry.
Let's apply the transformation to the original image and see how it looks:
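One way to do that with the pieces above (variable names are mine):

# Undo the recovered angle, then undo the residual displacement found by
# correlation, and compare against the original padded image
aligned = rotate(rotated, -best_angle, reshape=False, order=1, mode="constant")
dy, dx = find_displacement(padded, aligned)
aligned = np.roll(aligned, (dy, dx), axis=(0, 1))

fig, axes = plt.subplots(1, 2, sharex=True, sharey=True)
axes[0].imshow(padded)
axes[0].set_title("original")
axes[1].imshow(aligned)
axes[1].set_title("aligned")
plt.show()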
Great, I wouldn't have done better than this manually.
I am using a sunflower image for beauty reasons, but the procedure is the same for any type of image. I use RGB to show that the image may have one additional dimension, i.e. it may use a feature vector instead of a scalar feature; if your feature is a scalar, you can reshape your data to (width, height, 1).
Working code below in case anyone else has this need of scipy's affine transformations:
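The listing itself did not survive this copy. Below is a minimal sketch of the approach, following the question's translation-only example but keeping rotation and translation together; the helper name align_with_padding is mine, and the author's actual fix may differ in detail:

import numpy as np
from scipy.ndimage import affine_transform

def align_with_padding(src, dst, matrix, offset, cval=-1):
    """Sketch (not the original answer's code): apply affine_transform so the
    full extents of both src and dst survive. matrix/offset follow the
    OpenCV-style convention used in the question, i.e. affine_transform is
    called with matrix.T."""
    src_y, src_x = src.shape
    # Map the src corners ((x, y) order) through the transform
    corners = np.array([[0, src_x, src_x, 0], [0, 0, src_y, src_y]], float)
    transf = np.dot(matrix.T, corners) - offset[::-1].reshape(2, 1)
    min_x, min_y = np.floor(transf.min(axis=1)).astype(int)
    max_x, max_y = np.ceil(transf.max(axis=1)).astype(int)
    # Shift everything into positive coordinates, folding the shift into
    # the offset handed to affine_transform
    anchor_x, anchor_y = max(0, -min_x), max(0, -min_y)
    shifted_offset = offset - np.dot(matrix, [anchor_y, anchor_x])
    dst_h, dst_w = dst.shape
    dst_padded = np.pad(
        dst,
        ((anchor_y, max(max_y, dst_h) - dst_h),
         (anchor_x, max(max_x, dst_w) - dst_w)),
        mode="constant", constant_values=cval,
    )
    src_aligned = affine_transform(
        src, matrix.T, offset=shifted_offset,
        output_shape=dst_padded.shape, order=3, mode="constant", cval=cval,
    )
    return src_aligned, dst_padded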
E.g. running it will show:

[figure: source, destination, aligned and padded images]

and zoomed in:

[figure: zoomed view of the alignment]
Apologies that the code is not nicely written as is.

Note that, running this in the wild, I notice it cannot handle any change in scale between the images, though I am not certain that isn't something to do with how I calculate the transformation - so it is a caveat worth noting, and checking out, if you are aligning images of different scales.
If you have two images that are similar (or the same) and you want to align them, you can do it using the two functions rotate and shift.

You first need to find the angle difference between the two images, angle_to_rotate; having that, you apply a rotation to src:
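A minimal sketch, assuming angle_to_rotate has already been estimated:

from scipy.ndimage import rotate, shift

rotated_src = rotate(src, angle_to_rotate, reshape=True)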
With reshape=True you avoid losing information from your original src matrix: it pads the result so the image can be translated around the (0, 0) indexes. You can calculate this translation, as it is (x*cos(angle), y*sin(angle)) where x and y are the dimensions of the image, but it probably won't matter.

Now you will need to translate the image to the source; for doing that you can use the shift function:
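Something like the following, with distance_y and distance_x still to be determined:

rotated_translated_src = shift(rotated_src, (distance_y, distance_x))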
In this case there is no reshape (because otherwise you wouldn't have any real translation), so if the image was not previously padded, some information will be lost. But you can do some padding with:
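For example (pad_width here is whatever margin you need):

import numpy as np

src_padded = np.pad(src, pad_width, mode="constant")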
To calculate distance_x and distance_y you will need to find a point that serves as a reference between rotated_src and the destination, then just calculate the distance on the x and y axes.

Summary

1. Make some padding in src and dst
2. Rotate src with scipy.ndimage.rotate using reshape=True
3. Find the distance distance_x, distance_y between the rotated image and dst
4. Translate the rotated image onto dst with scipy.ndimage.shift

Code
First we make the destination image:
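A hedged reconstruction (the answer's listings were lost in this copy; the blob pattern and sizes are illustrative):

import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import rotate, shift

dst = np.zeros((300, 300))
dst[100:200, 100:200] = np.random.rand(100, 100)  # a block of random features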
We make the Source image:
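Continuing the sketch, the source is the same scene shifted and then rotated by known amounts (in practice these are the unknowns you estimate):

true_shift = (20, -30)
true_angle = 25
src = shift(dst, true_shift)
src = rotate(src, -true_angle, reshape=False, order=1, mode="constant")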
Then we align the src to the destination:
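And the alignment, following the summary's steps (the crop arithmetic here recentres the reshape=True output before undoing the shift):

# Undo the rotation first (reshape=True keeps every pixel), crop back to
# dst's shape about the centre, then undo the translation with shift
rotated_src = rotate(src, true_angle, reshape=True, order=1, mode="constant")
oy = (rotated_src.shape[0] - dst.shape[0]) // 2
ox = (rotated_src.shape[1] - dst.shape[1]) // 2
rotated_src = rotated_src[oy : oy + dst.shape[0], ox : ox + dst.shape[1]]
aligned = shift(rotated_src, (-true_shift[0], -true_shift[1]))

fig, axes = plt.subplots(1, 3, sharex=True, sharey=True)
for ax, im, title in zip(axes, (dst, src, aligned), ("dst", "src", "aligned")):
    ax.imshow(im, cmap="viridis")
    ax.set_title(title)
plt.show()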
PD: If you find problems finding the angle and the distances in a programmatic way, please leave a comment providing a bit more insight into what can be used as a reference (that could be, for example, the frame of the image or some image features/data).