What kind of descriptor should I use to detect seal cubs?
I have a project to detect and count seal cubs (the animal) in aerial images taken over a beach. The seal cubs are black and small, whereas the adult seals are brown and large.
Some seal cubs overlap or are partly occluded. The beach is close to yellow in color, but there are some black rocks that make detection harder.
What kind of descriptor is most suitable for my project: HOG, SIFT, or Haar-like features?
I'm asking about the theory behind this problem. I think the first step in implementing my project should be to choose the descriptor that best represents the object, then (combine several weak features, or is that unnecessary?) train a classifier using a machine learning method such as boosting, an SVM, or a neural network. Am I right?
Sample image:
Comments (3)
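The accuracy of computer vision algorithms seems to depend heavily on how well they can be tuned to the specific problem. If you can make assumptions about the images fed to the algorithm, for example that all of them are aerial images of seals in similar beach scenes, then you can take advantage of that. I would say that before focusing too much on local features, you might want to try something like watershed segmentation and count the number of non-background segments. Watershed offers a convenient mechanism called "markers" for incorporating prior knowledge about the input in order to separate the "background" and "foreground" parts.
An approach like this may be easier than local features, and possibly more accurate. In my experience, I have not been able to use SIFT and SURF features to extract and match a large number of meaningful features from organic subjects such as faces or animals. For me, they tend to work better on pictures of rooms or buildings taken from multiple angles.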
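To make the marker idea concrete, here is a minimal sketch of marker-based segmentation and counting. It is not the answerer's code. It assumes the image has already been thresholded into a boolean foreground mask (for example, "dark pixel" versus "sand"), and it uses only the Python standard library so every step is visible. The flooding loop is a simple priority-flood watershed written out by hand, the distance is city-block rather than Euclidean, and all function names and the `core_fraction` parameter are hypothetical. In practice a library routine such as OpenCV's `cv2.watershed` or scikit-image's `watershed` would normally replace the hand-written loop.

```python
import heapq
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity


def inside(grid, r, c):
    return 0 <= r < len(grid) and 0 <= c < len(grid[0])


def distance_to_background(mask):
    """City-block distance from every pixel to the nearest background pixel.

    Assumes the mask contains at least one background pixel.
    """
    dist = [[None if fg else 0 for fg in row] for row in mask]
    queue = deque((r, c) for r, row in enumerate(mask)
                  for c, fg in enumerate(row) if not fg)
    while queue:  # multi-source breadth-first search from the background
        r, c = queue.popleft()
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if inside(mask, nr, nc) and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def label_components(binary):
    """Give each 4-connected group of True pixels its own label (1, 2, ...)."""
    labels = [[0] * len(row) for row in binary]
    count = 0
    for r, row in enumerate(binary):
        for c, on in enumerate(row):
            if not on or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for dy, dx in NEIGHBORS:
                    ny, nx = y + dy, x + dx
                    if inside(binary, ny, nx) and binary[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


def flood_from_markers(dist, markers, mask):
    """Priority-flood watershed: grow markers over the mask, deepest pixels first."""
    labels = [row[:] for row in markers]
    heap = [(-dist[r][c], r, c) for r, row in enumerate(markers)
            for c, m in enumerate(row) if m]
    heapq.heapify(heap)
    while heap:
        _, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if inside(mask, nr, nc) and mask[nr][nc] and not labels[nr][nc]:
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (-dist[nr][nc], nr, nc))
    return labels


def segment_blobs(mask, core_fraction=0.6):
    """Return (label image, blob count) for a boolean foreground mask."""
    dist = distance_to_background(mask)
    peak = max(max(row) for row in dist)
    # Foreground markers: pixels deep inside a blob. The background needs no
    # explicit marker here because the flooding never leaves the mask.
    cores = [[d > core_fraction * peak for d in row] for row in dist]
    markers, count = label_components(cores)
    return flood_from_markers(dist, markers, mask), count


# Synthetic mask: two overlapping discs and one isolated disc (radius 8).
centers = [(30, 25), (30, 39), (30, 65)]
mask = [[any((r - cy) ** 2 + (c - cx) ** 2 <= 64 for cy, cx in centers)
         for c in range(80)] for r in range(60)]
labels, count = segment_blobs(mask)
print(count)  # prints 3: the two overlapping discs are counted separately
```

The count equals the number of markers, so the result is only as good as the marker selection. Taking cores relative to the single global peak is a simplification: blobs much smaller than the largest one (cubs next to adults, for instance) would lose their markers, so a real pipeline would probably need a size-aware or local rule here.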
I'm not sure I agree that selecting the right descriptor is the right place to start. A fundamental issue is that all the objects are similar in shape. There are also substantial gradients within each animal. The complexity of poses is another issue. I would break the problem into two simpler steps:
1. Unique object detection (edge detection, watershed, graph cut, etc.). Something like the "count blood cells" problem.
2. Object classification based on color and area (normalized to camera perspective). Compute the fractional amount of "yellow" colored pixels and "black" colored pixels in each object and use those values along with the object size as inputs to an object classifier (neural networks are a fun solution here!).
It is a fairly cluttered scene, so I would expect both of these algorithms to require some fine-tuning. If your requirements allow some level of analyst interaction, provide some sliders so the analyst can adjust each of the thresholds in your algorithms.
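Step 2 above can be sketched as a small feature extractor. This is only an illustration under stated assumptions: the image is a nested list of `(r, g, b)` tuples, `labels` is an integer grid from the detection step with 0 as background, and the "black" and "yellow" thresholds, the `metres_per_pixel` scale factor, and the function name are all hypothetical values chosen for the example rather than anything given in the answer.

```python
def object_features(image, labels, metres_per_pixel=1.0):
    """Map each object label to (area, black fraction, yellow fraction).

    The area is scaled by metres_per_pixel ** 2 as a stand-in for
    normalizing to the camera perspective.
    """
    counts = {}  # label -> [pixel count, black pixels, yellow pixels]
    for image_row, label_row in zip(image, labels):
        for (r, g, b), label in zip(image_row, label_row):
            if label == 0:  # background
                continue
            stats = counts.setdefault(label, [0, 0, 0])
            stats[0] += 1
            if max(r, g, b) < 60:  # assumed "black" threshold
                stats[1] += 1
            elif r > 150 and g > 150 and b < 130:  # assumed "yellow" threshold
                stats[2] += 1
    scale = metres_per_pixel ** 2
    return {label: (n * scale, black / n, yellow / n)
            for label, (n, black, yellow) in counts.items()}


# Tiny synthetic scene: object 1 is all black, object 2 is half brown, half sand.
SAND, BLACK, BROWN = (220, 200, 110), (20, 20, 20), (120, 80, 40)
labels = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]
image = [[BLACK if lab == 1 else BROWN if lab == 2 and c == 4 else SAND
          for c, lab in enumerate(row)] for row in labels]

features = object_features(image, labels)
print(features)  # {1: (4.0, 1.0, 0.0), 2: (6.0, 0.0, 0.5)}
```

Each tuple is a three-element feature vector that could be passed to whatever classifier is chosen. The thresholds are exactly the kind of values the answer suggests exposing to an analyst as sliders.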