What is the best method for object detection in low-resolution moving video?
I'm looking for the fastest and most efficient method of detecting an object in a moving video. Things to note about this video: it is very grainy and low resolution, and both the background and foreground are moving simultaneously.
Note: I'm trying to detect a moving truck on a road in a moving video.
Methods I've tried:
Training a Haar Cascade - I've attempted training a classifier to identify the object by cropping multiple images of the desired object. This produced either many false detections or no detections at all (the desired object was never detected). I used about 100 positive images and 4000 negatives.
SIFT and SURF Keypoints - When attempting either of these feature-based methods, I discovered that the object I wanted to detect was too low in resolution, so there were not enough features to match for an accurate detection. (The desired object was never detected.)
Template Matching - This is probably the best method I've tried. It's the most accurate, although also the hackiest of them all. I can detect the object in one specific video using a template cropped from that video. However, there is no guaranteed accuracy, because all that is known is the best match for each frame; no analysis is done of how well the template actually matches the frame. Basically, it only works if the object is always in the video; otherwise it will produce a false detection.
So those are the big three methods I've tried, and all have failed. What would work best is something like template matching but with scale and rotation invariance (which is what led me to try SIFT/SURF), but I have no idea how to modify the template matching function.
Does anyone have any suggestions how to best accomplish this task?
Apply optical flow to the image and then segment it based on flow field. Background flow is very different from "object" flow (which mainly diverges or converges depending on whether it is moving towards or away from you, with some lateral component also).
Here's an oldish project which worked this way:
http://users.fmrib.ox.ac.uk/~steve/asset/index.html
This vehicle detection paper uses a Gabor filter bank for low-level detection and then uses the responses to create the feature space in which it trains an SVM classifier.
The technique seems to work well and is at least scale invariant. I am not sure about rotation though.
Not knowing your application, my initial impression is normalized cross-correlation, especially since I remember seeing a purely optical cross-correlator that had vehicle-tracking as the example application. (Tracking a vehicle as it passes using only optical components and an image of the side of the vehicle - I wish I could find the link.) This is similar (if not identical) to "template matching", which you say kind of works, but this won't work if the images are rotated, as you know.
However, there's a related method based on log-polar coordinates that will work regardless of rotation, scale, shear, and translation.
I imagine this would also make it possible to detect that the object has left the scene of the video, since the maximum correlation will decrease.
How low resolution are we talking? Could you also elaborate on the object? Is it a specific color? Does it have a pattern? The answers affect what you should be using.
Also, I might be reading your template matching statement wrong, but it sounds like you are overtraining it (by testing on the same video you extracted the object from??).
A Haar Cascade is going to require significant training data on your part, and will be poor for any adjustments in orientation.
Your best bet might be to combine template matching with an algorithm similar to camshift in opencv (5.7 MB PDF), along with a probabilistic model (you'll have to figure this one out) of whether the truck is still in the image.
Use YOLO; with the help of Deep SORT, your work will be done quite easily, and it will be worth it.