FFT Convolution - How to Apply the Kernel
I'm pretty new to image processing and found out that FFT convolution speeds up convolution with large kernel sizes a lot.
My question is: how can I apply a kernel to an image in frequency space when using KissFFT?
I already did the following:
//I have an image with RGB pixels and given width/height
const int dim[2] = {height, width}; // dimensions of fft
const int dimcount = 2; // number of dimensions. here 2
kiss_fftnd_cfg stf = kiss_fftnd_alloc(dim, dimcount, 0, 0, 0); // forward 2d
kiss_fftnd_cfg sti = kiss_fftnd_alloc(dim, dimcount, 1, 0, 0); // inverse 2d
// the () value-initializes the input buffers, so their imaginary parts start at zero
kiss_fft_cpx *a = new kiss_fft_cpx[width * height]();
kiss_fft_cpx *r = new kiss_fft_cpx[width * height]();
kiss_fft_cpx *g = new kiss_fft_cpx[width * height]();
kiss_fft_cpx *b = new kiss_fft_cpx[width * height]();
kiss_fft_cpx *mask = new kiss_fft_cpx[width * height]();
kiss_fft_cpx *outa = new kiss_fft_cpx[width * height];
kiss_fft_cpx *outr = new kiss_fft_cpx[width * height];
kiss_fft_cpx *outg = new kiss_fft_cpx[width * height];
kiss_fft_cpx *outb = new kiss_fft_cpx[width * height];
kiss_fft_cpx *outmask = new kiss_fft_cpx[width * height];

for(unsigned int i = 0; i < height; i++) {
    for(unsigned int l = 0; l < width; l++) {
        float red   = intToFloat((int)Input(i,l)->Red);
        float green = intToFloat((int)Input(i,l)->Green);
        float blue  = intToFloat((int)Input(i,l)->Blue);
        int index = i * width + l;   // row-major: row * width + column
        a[index].r = 1.0;
        r[index].r = red;
        g[index].r = green;
        b[index].r = blue;
    }
}

// forward transforms (mask is still all zeros here - this is where the kernel image would go)
kiss_fftnd(stf, a, outa);
kiss_fftnd(stf, r, outr);
kiss_fftnd(stf, g, outg);
kiss_fftnd(stf, b, outb);
kiss_fftnd(stf, mask, outmask);

// inverse transforms back to the spatial domain
kiss_fftnd(sti, outa, a);
kiss_fftnd(sti, outr, r);
kiss_fftnd(sti, outg, g);
kiss_fftnd(sti, outb, b);
When I set the RGB values on an image again, I do get the original image back, so the transformation works.
What should I do now if I want to apply a kernel, for example a 9x9 box blur (1/9, 1/9, ..., 1/9)?
I have read some things about fast convolution, but they are all different depending on the implementation of the FFT. Is there a kind of "list" of things I have to take care of before applying a filter?
The way I think it should work:
The image size must be a power of 2;
I must create a kernel the same size as the image, set the 9 middle values to 1/9 and the rest to 0, then transform this kernel into the frequency domain, multiply the source image's spectrum with it, and transform the result back. But that doesn't really work :DD
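A minimal sketch of that plan, assuming the buffers from the code above and a 3x3 box blur for simplicity (not a definitive implementation): the modulo wrap places the kernel's center at index (0,0) so the output is not shifted, and the final division compensates for KissFFT's unscaled inverse transform.

// Sketch only: build the kernel as an image of the same size, with its taps
// wrapped around the corner at (0,0) so the filtered image is not shifted.
for (int ky = -1; ky <= 1; ky++) {
    for (int kx = -1; kx <= 1; kx++) {
        int y = (ky + height) % height;      // wrap negative offsets around the edge
        int x = (kx + width) % width;
        mask[y * width + x].r = 1.0f / 9.0f;
        mask[y * width + x].i = 0.0f;
    }
}
kiss_fftnd(stf, mask, outmask);              // kernel spectrum

// Apply the kernel: per-element complex multiplication of the image spectrum
// and the kernel spectrum, (a+bi)(c+di) = (ac-bd) + (ad+bc)i.
for (int i = 0; i < width * height; i++) {
    float re = outr[i].r * outmask[i].r - outr[i].i * outmask[i].i;
    float im = outr[i].r * outmask[i].i + outr[i].i * outmask[i].r;
    outr[i].r = re;
    outr[i].i = im;
    // ... repeat for outg and outb
}

// Back to the spatial domain; KissFFT's inverse is unscaled, so divide by the
// number of samples to get values back in the original range.
kiss_fftnd(sti, outr, r);
for (int i = 0; i < width * height; i++) {
    r[i].r /= (float)(width * height);
}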
Comments (1)
The convolution performed in the frequency domain is really a circular convolution. So when the non-zero elements of the kernel reach the edge of the picture, the convolution wraps around and includes pixels from the other side of the picture, which is probably not what you want. To deal with this, just zero-pad the input by as many elements as there are non-zero kernel elements along each dimension (actually one less will do). With a 3x3 kernel you need to add 3 - 1 = 2 zero pixels in each dimension.
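A sketch of that padding step, assuming a KxK kernel and the same row-major layout as in the question (padW, padH, rp and pstf are made-up names, not part of KissFFT). KissFFT accepts arbitrary sizes, so the padded size does not have to be a power of two, although sizes with small prime factors tend to be fastest.

// Sketch only: zero-pad the image to (width + K - 1) x (height + K - 1) so the
// circular wrap-around of the FFT convolution only touches the zero border.
const int K = 3;                                             // kernel size, e.g. 3x3
const int padW = width + K - 1;
const int padH = height + K - 1;
const int paddim[2] = {padH, padW};
kiss_fftnd_cfg pstf = kiss_fftnd_alloc(paddim, 2, 0, 0, 0);  // forward 2d on the padded size
kiss_fft_cpx *rp = new kiss_fft_cpx[padW * padH]();          // value-initialized to zero
for (unsigned int i = 0; i < height; i++) {
    for (unsigned int l = 0; l < width; l++) {
        rp[i * padW + l].r = intToFloat((int)Input(i,l)->Red);
    }
}
// Build the kernel in a padW x padH buffer the same way, transform both with pstf,
// multiply per element, inverse-transform, divide by padW*padH, and copy the
// top-left width x height block back into the output image.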