Get HOG image features from OpenCV + Python?

Posted 2024-11-09 05:12:18 · 314 characters · 0 views · 0 comments

I've read this post about how to use OpenCV's HOG-based pedestrian detector: How can I detect and track people using OpenCV?

I want to use HOG for detecting other types of objects in images (not just pedestrians). However, the Python binding of HOGDetectMultiScale doesn't seem to give access to the actual HOG features.

Is there any way to use Python + OpenCV to extract the HOG features directly from any image?

Comments (6)

夜声 2024-11-16 05:12:18

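In Python OpenCV you can compute the HOG descriptor like this:

 import cv2
 sample = "sample.jpg"  # placeholder path to your own image
 hog = cv2.HOGDescriptor()  # default parameters: 64x128 detection window
 im = cv2.imread(sample)
 h = hog.compute(im)  # flat array of float32 HOG features
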
巷雨优美回忆 2024-11-16 05:12:18

1. Get the built-in documentation: running the following in a Python console shows the structure of the HOGDescriptor class:

 import cv2
 help(cv2.HOGDescriptor)

2. Example code: here is a snippet that initializes a cv2.HOGDescriptor with different parameters (the terms used here are the standard ones defined in the OpenCV documentation):

import cv2

image = cv2.imread("test.jpg", 0)  # 0 = load as grayscale
winSize = (64, 64)       # detection window size, in pixels
blockSize = (16, 16)     # block size, in pixels
blockStride = (8, 8)     # step between neighbouring blocks, in pixels
cellSize = (8, 8)        # cell size, in pixels
nbins = 9                # orientation bins per cell histogram
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 0.2
gammaCorrection = 0
nlevels = 64
hog = cv2.HOGDescriptor(winSize, blockSize, blockStride, cellSize, nbins,
                        derivAperture, winSigma, histogramNormType,
                        L2HysThreshold, gammaCorrection, nlevels)
# compute(img[, winStride[, padding[, locations]]]) -> descriptors
winStride = (8, 8)
padding = (8, 8)
locations = ((10, 20),)  # a single window position
hist = hog.compute(image, winStride, padding, locations)

3. Reasoning: the resulting HOG descriptor has dimension
9 orientations x (4 corner cells that get 1 normalization + 6x4 edge cells that get 2 normalizations + 6x6 interior cells that get 4 normalizations) = 9 x (4 + 48 + 144) = 1764, because I passed only one location to hog.compute(). Equivalently: 7 x 7 blocks x 4 cells per block x 9 bins = 1764.

4. One more way to initialize is from an XML file that contains all the parameter values:

hog = cv2.HOGDescriptor("hog.xml")

To get such an XML file, you can do the following:

hog = cv2.HOGDescriptor()
hog.save("hog.xml")

and then edit the respective parameter values in the XML file.
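
The dimension count in step 3 can be double-checked without OpenCV. The following is only a sketch: the helper name hog_descriptor_length is made up for illustration, and it assumes the usual layout in which the window is tiled by overlapping blocks and each block contributes one histogram per cell.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of values in one window descriptor; sizes are (width, height) in pixels."""
    # How many block positions fit along each axis of the window
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block holds one nbins-long histogram per cell
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


# Parameters from step 2: 7 x 7 blocks x 4 cells x 9 bins
print(hog_descriptor_length((64, 64), (16, 16), (8, 8), (8, 8), 9))   # 1764
# Same block/cell settings with a 64x128 window: 7 x 15 blocks x 4 cells x 9 bins
print(hog_descriptor_length((64, 128), (16, 16), (8, 8), (8, 8), 9))  # 3780
```

If the length of the array returned by hog.compute() for a single location differs from this number, the parameters passed to the constructor are probably not the ones you think they are.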

原谅我要高飞 2024-11-16 05:12:18

Here is a solution that uses only OpenCV:

import numpy as np
import cv2
import matplotlib.pyplot as plt

img = cv2.cvtColor(cv2.imread("/home/me/Downloads/cat.jpg"),
                   cv2.COLOR_BGR2GRAY)

cell_size = (8, 8)  # h x w in pixels
block_size = (2, 2)  # h x w in cells
nbins = 9  # number of orientation bins

# winSize is the size of the image cropped to a multiple of the cell size
hog = cv2.HOGDescriptor(_winSize=(img.shape[1] // cell_size[1] * cell_size[1],
                                  img.shape[0] // cell_size[0] * cell_size[0]),
                        _blockSize=(block_size[1] * cell_size[1],
                                    block_size[0] * cell_size[0]),
                        _blockStride=(cell_size[1], cell_size[0]),
                        _cellSize=(cell_size[1], cell_size[0]),
                        _nbins=nbins)

n_cells = (img.shape[0] // cell_size[0], img.shape[1] // cell_size[1])
hog_feats = hog.compute(img)\
               .reshape(n_cells[1] - block_size[1] + 1,
                        n_cells[0] - block_size[0] + 1,
                        block_size[0], block_size[1], nbins) \
               .transpose((1, 0, 2, 3, 4))  # index blocks by rows first
# hog_feats now contains the gradient amplitudes for each direction,
# for each cell of its group for each group. Indexing is by rows then columns.

gradients = np.zeros((n_cells[0], n_cells[1], nbins))

# count cells (border cells appear less often across overlapping groups)
cell_count = np.full((n_cells[0], n_cells[1], 1), 0, dtype=int)

for off_y in range(block_size[0]):
    for off_x in range(block_size[1]):
        gradients[off_y:n_cells[0] - block_size[0] + off_y + 1,
                  off_x:n_cells[1] - block_size[1] + off_x + 1] += \
            hog_feats[:, :, off_y, off_x, :]
        cell_count[off_y:n_cells[0] - block_size[0] + off_y + 1,
                   off_x:n_cells[1] - block_size[1] + off_x + 1] += 1

# Average gradients
gradients /= cell_count

# Preview
plt.figure()
plt.imshow(img, cmap='gray')
plt.show()

bin = 5  # with OpenCV's default unsigned gradients, the bin angle is 180 / nbins * bin (degrees)
plt.pcolor(gradients[:, :, bin])
plt.gca().invert_yaxis()
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.show()

I used the article "HOG descriptor computation and visualization" to understand the data layout, and vectorized the loops over groups.
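
To see why the code above divides by cell_count, here is a small pure-Python sketch (not part of the original answer; the function name block_membership_counts is made up) that counts how many overlapping blocks contain each cell, assuming a block stride of one cell as in the code above.

```python
def block_membership_counts(n_cells_y, n_cells_x, block_h=2, block_w=2):
    """For every cell, count the overlapping blocks (stride of one cell) that contain it."""
    counts = [[0] * n_cells_x for _ in range(n_cells_y)]
    for by in range(n_cells_y - block_h + 1):      # top-left cell row of the block
        for bx in range(n_cells_x - block_w + 1):  # top-left cell column of the block
            for dy in range(block_h):
                for dx in range(block_w):
                    counts[by + dy][bx + dx] += 1
    return counts


counts = block_membership_counts(8, 8)
for row in counts:
    print(row)
```

For an 8 x 8 grid of cells with 2 x 2 blocks, corner cells belong to one block, other border cells to two, and interior cells to four, which is why border cells need a smaller divisor when the per-block values are averaged back into per-cell values.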

浅浅淡淡 2024-11-16 05:12:18

Although a built-in method exists, as mentioned in previous answers:

hog = cv2.HOGDescriptor()

I would like to post a Python implementation that you can find in OpenCV's examples directory, hoping it is useful for understanding how HOG works:

import cv2
import numpy as np
from numpy.linalg import norm

def hog(img):
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)  # horizontal gradient
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)  # vertical gradient
    mag, ang = cv2.cartToPolar(gx, gy)     # per-pixel magnitude and angle (radians)
    bin_n = 16 # Number of bins
    bin = np.int32(bin_n*ang/(2*np.pi))    # quantize each angle to a bin index

    bin_cells = []
    mag_cells = []

    cellx = celly = 8

    # Integer division (//) so that range() also works on Python 3
    for i in range(0, img.shape[0]//celly):
        for j in range(0, img.shape[1]//cellx):
            bin_cells.append(bin[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])
            mag_cells.append(mag[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])

    # One magnitude-weighted orientation histogram per cell
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)

    # transform to Hellinger kernel
    eps = 1e-7
    hist /= hist.sum() + eps
    hist = np.sqrt(hist)
    hist /= norm(hist) + eps

    return hist
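
To make the binning step concrete, here is a dependency-free sketch of the same idea for a handful of gradient vectors. It is an illustration only, not the OpenCV sample: the name orientation_histogram is made up, and math.atan2 stands in for cv2.cartToPolar.

```python
import math


def orientation_histogram(gradients, bin_n=16):
    """Magnitude-weighted histogram of gradient directions over [0, 2*pi)."""
    hist = [0.0] * bin_n
    for gx, gy in gradients:
        mag = math.hypot(gx, gy)
        ang = math.atan2(gy, gx) % (2 * math.pi)  # map (-pi, pi] to [0, 2*pi)
        # Same quantization as bin_n*ang/(2*pi); min() clamps rounding at the top edge
        idx = min(int(bin_n * ang / (2 * math.pi)), bin_n - 1)
        hist[idx] += mag  # each gradient votes with its magnitude
    return hist


# Three gradients pointing right, up and left, with magnitudes 1, 2 and 3
hist = orientation_histogram([(1.0, 0.0), (0.0, 2.0), (-3.0, 0.0)])
print(hist)
```

With 16 bins over the full circle, each bin spans 22.5 degrees, so the three gradients land in bins 0, 4 and 8. The function above does the same thing per 8x8 cell and then concatenates and normalizes the cell histograms.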

Regards.

秋意浓 2024-11-16 05:12:18

I would disagree with peakxu's argument. In the end, the HOG detector is "just" a rigid linear filter. Any degrees of freedom in the "object" (e.g. a person) lead to blurring in the detector and are not actually handled by it. There is an extension of this detector using latent SVMs that does handle degrees of freedom explicitly, by introducing structural constraints between independent parts (e.g. head, arms) and by allowing multiple appearances per object (e.g. frontal and sideways views of people).

Regarding the HOG detector in OpenCV: in theory you can upload another detector to be used with the features, but AFAIK you cannot get the features themselves. Thus, if you have a trained detector (i.e. a class-specific linear filter), you should be able to upload it into the detector to get OpenCV's fast detection performance. That said, it should be easy to hack the OpenCV source code to provide this access and propose the patch back to the maintainers.
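
As a rough illustration of what "a class-specific linear filter" means here, the sketch below scores one descriptor against a weight vector plus a bias. The function names and the toy numbers are made up. The packing of the bias as the last element is my assumption about the layout a detector vector is expected to have; with OpenCV I would expect such a vector, converted to a float32 NumPy array, to be what hog.setSVMDetector(...) takes, but check that against your OpenCV version.

```python
def pack_linear_detector(weights, bias):
    """Append the bias to the weights: one flat list of len(weights) + 1 values."""
    return [float(w) for w in weights] + [float(bias)]


def linear_detector_score(descriptor, packed):
    """Score one window: dot(weights, descriptor) + bias."""
    weights, bias = packed[:-1], packed[-1]
    if len(weights) != len(descriptor):
        raise ValueError("detector and descriptor lengths differ")
    return sum(w * x for w, x in zip(weights, descriptor)) + bias


# Toy numbers only; a real detector has one weight per HOG value
packed = pack_linear_detector([0.5, -1.0, 2.0], bias=-0.25)
score = linear_detector_score([1.0, 1.0, 1.0], packed)
print(score)  # 0.5 - 1.0 + 2.0 - 0.25 = 1.25
```

A window is reported as a detection when its score exceeds some threshold, so training a detector for a new object class amounts to learning the weights and bias (for example with a linear SVM) on HOG descriptors of positive and negative windows.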

晨与橙与城 2024-11-16 05:12:18

I would not recommend using HOG features for detecting objects other than pedestrians. In the original HOG paper by Dalal and Triggs, they specifically mention that their detector is built around pedestrian detection, allowing for significant degrees of freedom in the limbs while using strong structural hints around the human body.

Instead, try looking at OpenCV's HaarDetectObjects. You can learn how to train your own cascades here.
