Computing the discrete cosine transform of an image with OpenCV

Posted on 2024-12-12 07:37:05

I'm trying to use the OpenCV 2.3 Python wrapper to calculate the DCT for an image. Supposedly, images == numpy arrays == CV matrices, so I thought this should work:

import cv2
img1 = cv2.imread('myimage.jpg', cv2.CV_LOAD_IMAGE_GRAYSCALE)
img2 = cv2.dct(img1)

However, this throws the error:

cv2.error: /usr/local/lib/OpenCV-2.3.1/modules/core/src/dxt.cpp:2247: error: (-215) type == CV_32FC1 || type == CV_64FC1 in function dct

I realize the error means the input should be a 32-bit or 64-bit single-channel floating-point matrix. However, I thought that's how my image would be loaded when specifying grayscale, or at least that it would be close enough for cv2 to figure out the conversion.

What's the appropriate way to convert an image for DCT using cv2?

入画浅相思 2024-12-19 07:37:05

There doesn't seem to be any easy way to do this with cv2. The closest solution I could find is:

import cv, cv2  # legacy OpenCV 2.x bindings
import numpy as np

img1 = cv2.imread('myimage.jpg', cv2.CV_LOAD_IMAGE_GRAYSCALE)  # 8-bit, single channel
h, w = img1.shape[:2]
vis0 = np.zeros((h,w), np.float32)
vis0[:h, :w] = img1  # copy into a float32 array, which cv2.dct accepts
vis1 = cv2.dct(vis0)
img2 = cv.CreateMat(vis1.shape[0], vis1.shape[1], cv.CV_32FC3)
cv.CvtColor(cv.fromarray(vis1), img2, cv.CV_GRAY2BGR)  # expand to 3 channels for saving

cv.SaveImage('output.jpg', img2)

成熟的代价 2024-12-19 07:37:05

I did not want to write this answer, but since I have seen some wrong answers get upvoted, I decided to write one.

The DCT works on inputs in any range, so I really do not understand why others scaled theirs to [0, 1]. In OpenCV, however, you need to pass numpy.float32 (or float64) values.

import cv2
import numpy as np

x = np.array([8, 16, 24 , 32, 40, 48, 56, 64])
cv2.dct(np.float32(x))

# output
array([[ 101.82337189],
       [ -51.53858566],
       [   0.        ],
       [  -5.38763857],
       [   0.        ],
       [  -1.60722351],
       [   0.        ],
       [  -0.40561762]], dtype=float32)

But if you scale the input and then round or cast the result to an integer type, almost all of the small coefficients will be lost.

Here is a link to the formula and examples:
https://users.cs.cf.ac.uk/Dave.Marshall/Multimedia/node231.html#DCTbasis
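
To see what scaling does to the coefficients, the transform can be written out directly in NumPy. The sketch below assumes that `cv2.dct` computes the orthonormal DCT-II, which is consistent with the output printed above; the helper name `dct_ortho` is hypothetical and is not an OpenCV function.

```python
import numpy as np

def dct_ortho(x):
    """Orthonormal 1-D DCT-II of a 1-D sequence, written out from the formula."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    k = np.arange(n).reshape(-1, 1)   # output (frequency) index
    i = np.arange(n).reshape(1, -1)   # input (sample) index
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)       # the DC term uses a smaller factor
    return scale * (basis @ x)

x = np.array([8, 16, 24, 32, 40, 48, 56, 64])
full_range = dct_ortho(x)
unit_range = dct_ortho(x / 64.0)      # same data scaled so the maximum is 1
```

Under these assumptions the two results differ only by the factor 64, because the DCT is linear. In floating point the scaling by itself loses nothing; information is lost only if the scaled coefficients are later rounded or cast to integers.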

末骤雨初歇 2024-12-19 07:37:05

Here is a solution that I got from the OpenCV forums, and it worked.

import cv2
import numpy as np

img = cv2.imread(fn, 0)      # fn: path to the image; 1 chan, grayscale!
imf = np.float32(img)/255.0  # float conversion/scale
dst = cv2.dct(imf)           # the dct
img = np.uint8(np.clip(dst*255.0, 0, 255))  # scale back, clip, then cast to 8-bit
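
When converting float data back to 8 bits, the order of the cast and the multiplication matters: casting first truncates every value below 1 to 0. A small sketch with made-up values (not real DCT output; the variable names are illustrative only) shows the difference.

```python
import numpy as np

# Made-up float values in [0, 1]; not real DCT output.
values = np.array([0.0, 0.25, 0.6, 1.0], dtype=np.float32)

# Cast first: everything below 1 becomes 0 before the multiplication.
cast_then_scale = values.astype(np.uint8) * 255.0

# Scale first, clip to the 8-bit range, round, then cast.
scale_then_cast = np.rint(np.clip(values * 255.0, 0, 255)).astype(np.uint8)
```

Real DCT coefficients can also be negative or larger than 1, which is why the sketch clips before casting.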

不爱素颜 2024-12-19 07:37:05

Well, when you load the image as grayscale, it is actually read in at 8 bits per pixel, not as 32-bit float values.

Here is how you would do it:

img1_32f = cv.CreateImage(cv.GetSize(img1), cv.IPL_DEPTH_32F, 1)  # 32-bit float, 1 channel
cv.Scale(img1, img1_32f, 1.0, 0.0)

Also, have a look at the dft.py example. This should give you a feel for how to use the dft as well.
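
With NumPy arrays, the depth conversion done by `cv.Scale(src, dst, scale, shift)` (which computes `src * scale + shift` into the destination depth) can be sketched as a plain `astype` followed by arithmetic. The array below is synthetic and the variable names are illustrative only.

```python
import numpy as np

# Synthetic 8-bit grayscale data standing in for a loaded image.
img_u8 = np.array([[0, 128], [200, 255]], dtype=np.uint8)

scale, shift = 1.0, 0.0
# Convert to 32-bit float first, then apply the same scale and shift.
img_f32 = img_u8.astype(np.float32) * scale + shift
```

The result has dtype `float32`, which is one of the two types the `dct` assertion in the question accepts.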

痴梦一场 2024-12-19 07:37:05

Numpy has slice operators for working between arrays of different orders.

import cv2
import numpy as np

img1 = cv2.imread('myimage.jpg')
# or pass cv2.CV_LOAD_IMAGE_GRAYSCALE to imread and skip the conversion below
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
cv2.imshow('input', img1)
h, w = img1.shape  # numpy shape is (rows, columns), i.e. (height, width)
# make a 32-bit float array for doing the dct within
img2 = np.zeros((h, w), dtype=np.float32)
print(img1.shape, img2.shape)
img2 = img2 + img1[:h, :w]  # float32 + uint8 yields a float32 array
dct1 = cv2.dct(img2)
key = -1
while key < 0:  # redraw until any key is pressed
    cv2.imshow("DCT", dct1)
    key = cv2.waitKey(1)
cv2.destroyAllWindows()
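
According to the OpenCV documentation, `cv2.imshow` multiplies floating-point images by 255, so it treats them as lying in [0, 1], and raw DCT coefficients mostly saturate on screen. One common way to make them visible, sketched here with a hypothetical helper `to_display` and synthetic coefficients, is to take the log of the magnitudes and rescale the result into [0, 1].

```python
import numpy as np

def to_display(coeffs):
    """Map DCT coefficients to [0, 1] using log-magnitude (hypothetical helper)."""
    mag = np.log1p(np.abs(coeffs))      # compress the large dynamic range
    peak = mag.max()
    return mag / peak if peak > 0 else mag

# Synthetic coefficients: one large DC term and a few small AC terms.
coeffs = np.array([[1000.0, -20.0], [5.0, 0.0]], dtype=np.float32)
view = to_display(coeffs)
```

The array `view` could then be passed to `cv2.imshow` in place of the raw coefficients. The sign of each coefficient is discarded, so this is for inspection only.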

尾戒 2024-12-19 07:37:05

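Here's how to do it with scipy:

import numpy as np
from PIL import Image
from scipy.fftpack import dct

if __name__ == '__main__':
    image_counter = 1

    # Apply the DCT to the noisy image patches.
    noise_image_path = 'noise_images/' + str(image_counter) + '.png'
    # Load as single-channel grayscale and convert to a float array.
    noise_image = np.asarray(Image.open(noise_image_path).convert('L'), dtype=np.float64)
    # scipy's dct is 1-D along one axis, so apply it to both axes for a 2-D DCT.
    noise_dct_data = dct(dct(noise_image, axis=0, norm='ortho'), axis=1, norm='ortho')
    print(noise_dct_data)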
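
A 2-D DCT is separable: transforming along one axis and then the other gives the same result as multiplying the image by the 1-D basis matrix on both sides. The sketch below uses only NumPy and a synthetic array; the helper name `dct_matrix` is hypothetical, and the orthonormal DCT-II normalization is an assumption chosen so that the inverse is just the transpose.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n (hypothetical helper)."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # the DC row uses a smaller factor
    return c

# Synthetic 4 x 6 "image"; any float array works.
image = np.arange(24, dtype=np.float64).reshape(4, 6)
rows, cols = image.shape
c_rows, c_cols = dct_matrix(rows), dct_matrix(cols)

coeffs = c_rows @ image @ c_cols.T      # forward 2-D DCT
restored = c_rows.T @ coeffs @ c_cols   # inverse, since the matrices are orthogonal
```

The top-left coefficient equals the sum of all pixels divided by the square root of the pixel count, and `restored` matches `image` up to floating-point error.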