Object detection + segmentation

Posted on 2024-12-01 22:40:17

I'm trying to find an efficient way, of acceptable complexity, to

detect an object in an image so I can isolate it from its surroundings, and
segment that object into its sub-parts and label them so I can then fetch them at will.

It's been 3 weeks since I entered the image processing world, and I've read about so many algorithms (SIFT, snakes, more snakes, Fourier-related methods, etc.) and heuristics that I don't know where to start or which one is "best" for what I'm trying to achieve. Bearing in mind that the image dataset of interest is a pretty large one, I don't even know whether I should use an algorithm implemented in OpenCV or implement one of my own.

To summarize:

Which methodology should I focus on? Why?
Should I use OpenCV for that kind of stuff or is there some other 'better' alternative?

Thank you in advance.

EDIT -- More info regarding the datasets

Each dataset consists of 80K images of products sharing the same

concept, e.g. t-shirts, watches, shoes
size
orientation (90% of them)
background (95% of them)

Apart from the product itself, all pictures in each dataset look almost identical. To make things a little clearer, let's consider only the 'watch dataset':

All the pictures in the set look almost exactly like this:

[image: sample picture from the watch dataset]

(again, apart from the watch itself). I want to extract the strap and the dial. The thing is that there are lots of different watch styles and therefore shapes. From what I've read so far, I think I need a template algorithm that allows bending and stretching, so that it can match straps and dials of different styles.

Instead of creating three distinct templates (upper part of strap, lower part of strap, dial), it would be reasonable to create only one and segment it into 3 parts. That way, I could be confident that the parts are detected in the intended positions relative to each other, e.g. the dial would not be detected below the lower part of the strap.

Of all the algorithms/methodologies I've encountered, active shape/appearance models seem to be the most promising. Unfortunately, I haven't managed to find a decent implementation, and I'm not confident enough that this is the best approach to go ahead and write one myself.

If anyone could point out what I should really be looking for (algorithm/heuristic/library/etc.), I would be more than grateful. If you think my description is still a bit vague, feel free to ask for a more detailed one.

橙味迷妹 2024-12-08 22:40:17

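Based on what you've said, here are a few things that come to mind at first glance:

The simplest thing to do is to binarize the image and run connected components using OpenCV or the CvBlob library. For simple images with a non-complex background, this usually yields the objects.
However, looking at your sample image, texture-based segmentation techniques may work better: the watch dial, the straps and the background differ widely in texture/roughness, and this could be an ideal way to separate them.
The roughness of a region can be found easily via the Eigen transform (explained a bit on SO; check the link to the research paper provided there), and a mean shift filter can then be applied to the output of the Eigen transform. This will give regions clearly separated according to texture. Both pyramidal mean shift and finding eigenvalues by SVD are implemented in OpenCV, so unless you can optimize your own code, it's better (and easier) to use the built-in functions (where they exist) as far as speed and efficiency are concerned.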