Get HOG image features from OpenCV + Python?

Posted on 2024-11-09 05:12:18

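I've read this post about how to use OpenCV's HOG-based pedestrian detector: How can I detect and track people using OpenCV?

I want to use HOG to detect other types of objects in images (not just pedestrians). However, the Python binding of HOGDetectMultiScale doesn't seem to give access to the actual HOG features.

Is there any way to use Python + OpenCV to extract the HOG features directly from any image?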

夜声 2024-11-16 05:12:18

In Python OpenCV you can compute HOG like this:

 import cv2
 hog = cv2.HOGDescriptor()  # default parameters (64x128 detection window)
 im = cv2.imread(sample)  # sample: path to an image file
 h = hog.compute(im)

巷雨优美回忆 2024-11-16 05:12:18

1. Get the built-in documentation: running the following in a Python console shows the structure of the HOGDescriptor class:

 import cv2
 help(cv2.HOGDescriptor)

2. Example code: the snippet below initializes a cv2.HOGDescriptor with explicit parameters (the parameter names are the standard terms defined in the OpenCV documentation):

import cv2
image = cv2.imread("test.jpg", 0)  # 0 = load as grayscale
winSize = (64, 64)
blockSize = (16, 16)
blockStride = (8, 8)
cellSize = (8, 8)
nbins = 9
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 0.2
gammaCorrection = 0
nlevels = 64
hog = cv2.HOGDescriptor(winSize, blockSize, blockStride, cellSize, nbins, derivAperture, winSigma,
                        histogramNormType, L2HysThreshold, gammaCorrection, nlevels)
# compute(img[, winStride[, padding[, locations]]]) -> descriptors
winStride = (8, 8)
padding = (8, 8)
locations = ((10, 20),)  # a single window location
hist = hog.compute(image, winStride, padding, locations)

3. Reasoning: the resulting HOG descriptor has length
9 orientations × (4 corner cells that are normalized once + 6×4 edge cells that are normalized twice + 6×6 interior cells that are normalized four times) = 9 × (4 + 48 + 144) = 1764, because I passed only one location to hog.compute().
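
The count above can be checked with plain arithmetic. The following is a minimal sketch, not part of OpenCV: the helper names `hog_descriptor_length` and `cell_coverage` are made up for illustration, and the sketch assumes the window, block, stride, and cell sizes from the snippet in item 2.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Length of the HOG descriptor for one detection window.

    All sizes are (width, height) in pixels, as in the snippet in item 2.
    """
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


def cell_coverage(cells_x, cells_y, block_cells=2):
    """Count how many overlapping blocks (stride of one cell) contain each cell."""
    counts = [[0] * cells_x for _ in range(cells_y)]
    for by in range(cells_y - block_cells + 1):
        for bx in range(cells_x - block_cells + 1):
            for dy in range(block_cells):
                for dx in range(block_cells):
                    counts[by + dy][bx + dx] += 1
    return counts


# Counted per block: 7 * 7 blocks * 4 cells per block * 9 bins.
length = hog_descriptor_length((64, 64), (16, 16), (8, 8), (8, 8), 9)

# Counted per cell: each of the 8 x 8 cells contributes once per block containing it.
coverage = cell_coverage(8, 8)
length_per_cell = 9 * sum(sum(row) for row in coverage)

print(length, length_per_cell)
```

Both ways of counting give the same total: corner cells appear in one block, edge cells in two, and interior cells in four.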

4. Another way to initialize is from an XML file that contains all the parameter values:

hog = cv2.HOGDescriptor("hog.xml")

To get such an XML file, you can do the following:

hog = cv2.HOGDescriptor()
hog.save("hog.xml")

and then edit the respective parameter values in the XML file.

原谅我要高飞 2024-11-16 05:12:18

Here is a solution that uses only OpenCV:

import numpy as np
import cv2
import matplotlib.pyplot as plt

img = cv2.cvtColor(cv2.imread("/home/me/Downloads/cat.jpg"),
                   cv2.COLOR_BGR2GRAY)

cell_size = (8, 8)  # h x w in pixels
block_size = (2, 2)  # h x w in cells
nbins = 9  # number of orientation bins

# winSize is the size of the image cropped to a multiple of the cell size
hog = cv2.HOGDescriptor(_winSize=(img.shape[1] // cell_size[1] * cell_size[1],
                                  img.shape[0] // cell_size[0] * cell_size[0]),
                        _blockSize=(block_size[1] * cell_size[1],
                                    block_size[0] * cell_size[0]),
                        _blockStride=(cell_size[1], cell_size[0]),
                        _cellSize=(cell_size[1], cell_size[0]),
                        _nbins=nbins)

n_cells = (img.shape[0] // cell_size[0], img.shape[1] // cell_size[1])
hog_feats = hog.compute(img)\
               .reshape(n_cells[1] - block_size[1] + 1,
                        n_cells[0] - block_size[0] + 1,
                        block_size[0], block_size[1], nbins) \
               .transpose((1, 0, 2, 3, 4))  # index blocks by rows first
# hog_feats now holds, for every block (group of cells), the gradient amplitude
# per orientation bin of each cell in that block. Blocks are indexed by row, then column.

gradients = np.zeros((n_cells[0], n_cells[1], nbins))

# count cells (border cells appear less often across overlapping groups)
cell_count = np.full((n_cells[0], n_cells[1], 1), 0, dtype=int)

for off_y in range(block_size[0]):
    for off_x in range(block_size[1]):
        gradients[off_y:n_cells[0] - block_size[0] + off_y + 1,
                  off_x:n_cells[1] - block_size[1] + off_x + 1] += \
            hog_feats[:, :, off_y, off_x, :]
        cell_count[off_y:n_cells[0] - block_size[0] + off_y + 1,
                   off_x:n_cells[1] - block_size[1] + off_x + 1] += 1

# Average gradients
gradients /= cell_count

# Preview
plt.figure()
plt.imshow(img, cmap='gray')
plt.show()

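# Orientation bin to display. With OpenCV's default unsigned gradients the bins
# span 0-180 degrees, so the angle is roughly 180 / nbins * bin_idx.
bin_idx = 5
plt.pcolor(gradients[:, :, bin_idx])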
plt.gca().invert_yaxis()
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.show()

I used the post "HOG descriptor computation and visualization" to understand the data layout, and vectorized the loops over groups.
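
The averaging step over overlapping groups can be tried in isolation on synthetic data. This is a minimal NumPy sketch under stated assumptions: the function name `average_cells` is hypothetical, and the input is assumed to be already arranged as (block rows, block columns, cell row in block, cell column in block, bins), which is the layout the code above produces after its reshape and transpose.

```python
import numpy as np


def average_cells(block_feats):
    """Average per-cell histograms over all overlapping blocks.

    block_feats has shape (block_rows, block_cols, cells_y, cells_x, nbins),
    with blocks advancing one cell at a time (assumed layout for this sketch).
    """
    block_rows, block_cols, cells_y, cells_x, nbins = block_feats.shape
    n_rows = block_rows + cells_y - 1  # number of cell rows in the image
    n_cols = block_cols + cells_x - 1  # number of cell columns in the image
    sums = np.zeros((n_rows, n_cols, nbins))
    counts = np.zeros((n_rows, n_cols, 1))
    for off_y in range(cells_y):
        for off_x in range(cells_x):
            # All blocks contribute their (off_y, off_x) cell in one slice assignment.
            sums[off_y:off_y + block_rows, off_x:off_x + block_cols] += \
                block_feats[:, :, off_y, off_x, :]
            counts[off_y:off_y + block_rows, off_x:off_x + block_cols] += 1
    return sums / counts, counts[:, :, 0]


# Synthetic input: 3 x 4 blocks of 2 x 2 cells with 9 bins, all ones.
feats = np.ones((3, 4, 2, 2, 9))
cell_hists, counts = average_cells(feats)
print(cell_hists.shape)
print(counts)
```

With constant input the averaged histograms stay constant, while `counts` shows that border cells are covered by fewer blocks than interior cells, which is why the sums are divided by a per-cell count rather than a fixed number.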

浅浅淡淡 2024-11-16 05:12:18

Although a built-in method exists, as mentioned in the previous answers:

hog = cv2.HOGDescriptor()

I would like to post a Python implementation that you can find in OpenCV's examples directory, hoping it is useful for understanding how HOG works:

import cv2
import numpy as np
from numpy.linalg import norm

def hog(img):
    # img: single-channel (grayscale) image
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)  # horizontal gradient
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)  # vertical gradient
    mag, ang = cv2.cartToPolar(gx, gy)
    bin_n = 16  # number of orientation bins
    bins = np.int32(bin_n * ang / (2 * np.pi))  # quantize angles to bin indices

    bin_cells = []
    mag_cells = []

    cellx = celly = 8  # cell size in pixels

    # integer division so that range() also works on Python 3
    for i in range(img.shape[0] // celly):
        for j in range(img.shape[1] // cellx):
            bin_cells.append(bins[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])
            mag_cells.append(mag[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])

    # magnitude-weighted orientation histogram for each cell
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)

    # transform to Hellinger kernel
    eps = 1e-7
    hist /= hist.sum() + eps
    hist = np.sqrt(hist)
    hist /= norm(hist) + eps

    return hist
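
To see what the np.bincount call and the normalization in the function above do, here is a small sketch on a single made-up cell. The 2 x 2 angle and magnitude values and the reduced bin count are hypothetical, chosen only to keep the numbers readable; no OpenCV call is involved.

```python
import numpy as np

bin_n = 4  # number of orientation bins (small, for readability)

# Hypothetical 2 x 2 cell: gradient angles in radians and their magnitudes.
ang = np.array([[0.1, 1.7], [3.3, 1.6]])
mag = np.array([[2.0, 1.0], [0.5, 3.0]])

# Map each angle in [0, 2*pi) to a bin index, as in the function above.
bins = np.int32(bin_n * ang / (2 * np.pi))

# Weighted histogram: each pixel adds its magnitude to the bin of its angle.
raw_hist = np.bincount(bins.ravel(), mag.ravel(), bin_n)

# Hellinger-style normalization: L1-normalize, take the square root, then L2-normalize.
eps = 1e-7
hist = raw_hist / (raw_hist.sum() + eps)
hist = np.sqrt(hist)
hist /= np.linalg.norm(hist) + eps

print(bins.ravel())
print(raw_hist)
print(hist)
```

The two pixels whose angles fall into the same bin have their magnitudes summed, and the third argument of np.bincount keeps the histogram at `bin_n` entries even when the last bins are empty.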

Regards.

秋意浓 2024-11-16 05:12:18

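I disagree with peakxu's argument. In the end, the HOG detector is "just" a rigid linear filter. Any degrees of freedom in the "object" (e.g. people) lead to blurring in the detector and are not actually handled by it. There is an extension of this detector using latent SVMs that handles degrees of freedom explicitly, by introducing structural constraints between independent parts (e.g. head, arms, etc.) and by allowing multiple appearances per object (e.g. frontal people and sideways people).

Regarding the HOG detector in OpenCV: in theory you can upload another detector to be used with the features, but you cannot get the features themselves. So if you have a trained detector (i.e. a class-specific linear filter), you should be able to upload it into the detector to get OpenCV's fast detection performance. That said, it should be easy to hack the OpenCV source code to provide this access and propose the patch back to the maintainers.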
