Sliding windows using numpy's as_strided function?

Posted on 2024-12-06 05:53:29


While implementing a sliding window in Python to detect objects in still images, I came across this nice function:

numpy.lib.stride_tricks.as_strided

So I tried to work out a general rule, to avoid mistakes I might make when changing the size of the sliding windows I need. In the end I arrived at this representation:

all_windows = as_strided(x,((x.shape[0] - xsize)/xstep ,(x.shape[1] - ysize)/ystep ,xsize,ysize), (x.strides[0]*xstep,x.strides[1]*ystep,x.strides[0],x.strides[1]))

This results in a 4-dimensional array. The first two dimensions give the number of windows along the x and y axes of the image, the other two give the size of the window (xsize, ysize), and the step is the displacement between two consecutive windows.

This representation works fine if I choose square sliding windows, but I still have a problem getting it to work for windows of, e.g., (128, 64), where I usually get data unrelated to the image.

What is wrong with my code? Any ideas? And is there a better way to get nice, neat sliding windows in Python for image processing?

Thanks


当梦初醒 2024-12-13 05:53:29

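There is an issue in your code. Actually, this code works well for 2D, and there is no reason to use the multi-dimensional version (see "Using strides for an efficient moving average filter"). Below is a fixed version:

import numpy as np
from numpy.lib.stride_tricks import as_strided

xsize, ysize = 3, 4  # window size along axis 0 and axis 1 (example values)
xstep, ystep = 1, 1  # step between consecutive windows (example values)
A = np.arange(100).reshape((10, 10))
print(A)
# number of windows per axis is (length - window) // step + 1; the shape must hold integers
all_windows = as_strided(A, ((A.shape[0] - xsize) // xstep + 1, (A.shape[1] - ysize) // ystep + 1, xsize, ysize),
      (A.strides[0] * xstep, A.strides[1] * ystep, A.strides[0], A.strides[1]))
print(all_windows)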
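As a sanity check on the shape and stride arithmetic in this answer, the sketch below wraps the same call in a helper and compares every window with an ordinary slice, using a non-square window. The helper name `sliding_windows` and the example sizes are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_windows(a, xsize, ysize, xstep=1, ystep=1):
    """Return a read-only 4-D view of all (xsize, ysize) windows of a 2-D array.

    Hypothetical helper: xsize/xstep run along axis 0 (rows),
    ysize/ystep along axis 1 (columns).
    """
    if a.ndim != 2:
        raise ValueError("expected a 2-D array")
    if xsize > a.shape[0] or ysize > a.shape[1]:
        raise ValueError("window is larger than the array")
    # Number of window positions along each axis.
    nx = (a.shape[0] - xsize) // xstep + 1
    ny = (a.shape[1] - ysize) // ystep + 1
    s0, s1 = a.strides
    return as_strided(
        a,
        shape=(nx, ny, xsize, ysize),
        strides=(s0 * xstep, s1 * ystep, s0, s1),
        writeable=False,  # windows overlap in memory, so forbid writes
    )


A = np.arange(100).reshape((10, 10))
windows = sliding_windows(A, xsize=4, ysize=2, xstep=2, ystep=3)

# Every window must equal the corresponding plain slice of A.
for i in range(windows.shape[0]):
    for j in range(windows.shape[1]):
        r, c = i * 2, j * 3
        assert np.array_equal(windows[i, j], A[r:r + 4, c:c + 2])
print(windows.shape)
```

If windows of a shape such as (128, 64) come out looking wrong, one thing worth checking with a comparison like this is whether the first window dimension is really paired with axis 0 (rows) of the image array.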


青巷忧颜 2024-12-13 05:53:29

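See the answers to this question: "Using strides for an efficient moving average filter". Basically, strides are not a great option, although they do work.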
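One alternative that avoids computing strides by hand, assuming NumPy 1.20 or later is available, is `numpy.lib.stride_tricks.sliding_window_view`. The sketch below is only an illustration with made-up sizes: it builds all step-1 windows and then applies the steps by slicing.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.arange(100).reshape((10, 10))
xsize, ysize = 4, 2   # example window size (rows, columns)
xstep, ystep = 2, 3   # example steps between consecutive windows

# All windows with step 1: shape (10 - 4 + 1, 10 - 2 + 1, 4, 2).
dense = sliding_window_view(image, (xsize, ysize))

# Apply the steps by slicing the two window-position axes; this is still a view.
windows = dense[::xstep, ::ystep]
print(dense.shape, windows.shape)
```

Because the function validates the window shape against the array, it cannot silently read memory outside the image, which is the main risk with a hand-written `as_strided` call.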


两仪 2024-12-13 05:53:29

For posterity:

This is implemented in scikit-learn in the function sklearn.feature_extraction.image.extract_patches.


荭秂 2024-12-13 05:53:29

I had a similar use case where I needed to create sliding windows over a batch of multi-channel images, and I ended up with the function below. I've written a more in-depth blog post covering this in the context of manually creating a convolution layer. The function implements the sliding windows and also supports dilating the input array or adding padding to it.

The function takes as input:

input - array of size (Batch, Channel, Height, Width)
output_size - depends on usage; see the comments below
kernel_size - size of the sliding window you wish to create (square)
padding - amount of 0-padding added to the outside of the (H, W) dimensions
stride - stride the sliding window should take over the inputs
dilate - amount to spread the cells of the input; this adds 0-filled rows/cols between elements

Typically, when performing forward convolution, you do not need dilation, so the output size can be found using the following formula (replace x with the input dimension):

(x - kernel_size + 2 * padding) // stride + 1
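As a quick worked example of this formula, the sketch below evaluates it for a few settings. The function name `conv_output_size` and the sample numbers are assumptions chosen for illustration.

```python
def conv_output_size(x, kernel_size, padding=0, stride=1):
    """Output length along one dimension for an input of length x."""
    return (x - kernel_size + 2 * padding) // stride + 1


print(conv_output_size(28, kernel_size=3))                       # 26
print(conv_output_size(28, kernel_size=3, padding=1))            # 28
print(conv_output_size(28, kernel_size=3, padding=1, stride=2))  # 14
```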

When performing the backward pass of a convolution with this function, use a stride of 1 and set output_size to the size of the forward pass's x input.

Sample code with an example of using this function can be found at this link.

import numpy as np

def getWindows(input, output_size, kernel_size, padding=0, stride=1, dilate=0):
    working_input = input
    working_pad = padding
    # dilate the input if necessary
    # (as written, one zero row/column is inserted between neighbours whenever dilate != 0)
    if dilate != 0:
        working_input = np.insert(working_input, range(1, input.shape[2]), 0, axis=2)
        working_input = np.insert(working_input, range(1, input.shape[3]), 0, axis=3)

    # pad the height and width axes with zeros if necessary
    if working_pad != 0:
        working_input = np.pad(working_input, pad_width=((0,), (0,), (working_pad,), (working_pad,)), mode='constant', constant_values=(0.,))

    # output_size supplies the number of window positions; batch and channel come from the input
    _, _, out_h, out_w = output_size
    out_b, out_c, _, _ = input.shape
    # strides (in bytes) of the dilated/padded array
    batch_str, channel_str, kern_h_str, kern_w_str = working_input.strides

    # 6-D view: (batch, channel, out_h, out_w, kernel_h, kernel_w)
    return np.lib.stride_tricks.as_strided(
        working_input,
        (out_b, out_c, out_h, out_w, kernel_size, kernel_size),
        (batch_str, channel_str, stride * kern_h_str, stride * kern_w_str, kern_h_str, kern_w_str)
    )
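To show how a 6-D window array of this layout can feed a forward convolution, here is a minimal self-contained sketch. It deliberately swaps in `sliding_window_view` (NumPy 1.20+) rather than calling `getWindows`, and the array sizes, random filters, and variable names are all assumptions for illustration. As is usual in deep-learning code, the "convolution" here is a cross-correlation (the kernel is not flipped).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 8, 8))   # (batch, channel, height, width)
w = rng.standard_normal((4, 3, 3, 3))   # (out_channel, in_channel, k, k)
padding, stride, k = 1, 2, 3

# Zero-pad only the height and width axes.
xp = np.pad(x, ((0, 0), (0, 0), (padding, padding), (padding, padding)))

# Windows over H and W, then subsample window positions by the stride.
# Resulting shape: (batch, channel, out_h, out_w, k, k)
windows = sliding_window_view(xp, (k, k), axis=(2, 3))[:, :, ::stride, ::stride]

# Contract the channel and kernel axes against the filters.
out = np.einsum("bchwkl,ockl->bohw", windows, w)
print(out.shape)
```

The output height and width follow the formula (x - kernel_size + 2 * padding) // stride + 1 from the answer, which gives 4 for the values used here.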
