Image processing: What are occlusions?

Posted on 2024-08-31 22:30:31, 85 words, 5 views, 0 comments

I'm developing an image processing project and I keep coming across the word "occlusion" in scientific papers. What do occlusions mean in the context of image processing? The dictionary only gives a general definition. Can anyone describe them using an image as context?

Comments (5)

海未深 2024-09-07 22:30:32

Occlusion means that there is something you want to see but can't, due to some property of your sensor setup or some event.
Exactly how it manifests itself, and how you deal with it, depends on the problem at hand.

Some examples:

If you are developing a system that tracks objects (people, cars, ...), then occlusion occurs when an object you are tracking is hidden (occluded) by another object, like two people walking past each other or a car driving under a bridge.
The problem in this case is what to do when an object disappears and then reappears.
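
One possible way to handle the disappear/reappear case, sketched under assumptions of my own rather than taken from the answer above: keep the track alive for a few frames while the object is hidden, move it along a constant-velocity prediction, and accept a detection again once one shows up close to the predicted position. The names `Track`, `gate` and `max_missed` are hypothetical and the thresholds are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """One tracked object with a constant-velocity motion guess (illustrative)."""
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0
    missed: int = 0  # consecutive frames without a matching detection

    def predict(self):
        """Position expected in the next frame if the velocity stays constant."""
        return self.x + self.vx, self.y + self.vy

    def update(self, detection, gate=20.0, max_missed=5):
        """Advance the track by one frame.

        `detection` is an (x, y) tuple, or None when nothing was detected.
        Returns False once the track has been missing for more than
        `max_missed` consecutive frames and should be dropped.
        """
        px, py = self.predict()
        if detection is not None:
            dx, dy = detection[0] - px, detection[1] - py
            if (dx * dx + dy * dy) ** 0.5 <= gate:
                # The detection agrees with the prediction: accept it.
                self.vx, self.vy = detection[0] - self.x, detection[1] - self.y
                self.x, self.y = detection
                self.missed = 0
                return True
        # Occluded (or the detection was too far away): coast on the prediction.
        self.x, self.y = px, py
        self.missed += 1
        return self.missed <= max_missed
```

Calling `update(None)` for each frame in which the object is hidden keeps the position estimate moving, so a detection that reappears near the predicted spot is picked up by the same track instead of starting a new one.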

If you are using a range camera, then occlusions are areas where you do not have any information. Some laser range cameras work by projecting a laser beam onto the surface you are examining and using a camera to identify the point of impact of that laser in the resulting image. That gives the 3D coordinates of that point. However, since the camera and the laser are not necessarily aligned, there can be points on the examined surface that the camera can see but the laser cannot hit (occlusion).
The problem here is more a matter of sensor setup.

The same can occur in stereo imaging if parts of the scene are seen by only one of the two cameras. Obviously, no range data can be collected from these points.
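
To make the stereo case concrete, here is a minimal sketch of a left-right consistency check, a common way to flag such points. It is not something the answer above prescribes. It assumes you already have integer disparity maps for both views as plain lists of rows, that a left pixel at column `x` with disparity `d` matches the right pixel at column `x - d`, and that `occlusion_mask` and `tol` are names I made up.

```python
def occlusion_mask(disp_left, disp_right, tol=1):
    """Mark left-image pixels whose disparity fails the left-right check.

    disp_left[y][x]  : disparity of left pixel (x, y); its match is right column x - d
    disp_right[y][x] : disparity of right pixel (x, y); its match is left column x + d
    Returns rows of booleans; True means "probably occluded in the right view".
    """
    mask = []
    for row_l, row_r in zip(disp_left, disp_right):
        width = len(row_l)
        row_mask = []
        for x, d in enumerate(row_l):
            xr = x - d  # column where this point should appear in the right image
            if xr < 0 or xr >= width:
                # The match would fall outside the right image.
                row_mask.append(True)
            else:
                # Both views must agree on the disparity within the tolerance.
                row_mask.append(abs(d - row_r[xr]) > tol)
        mask.append(row_mask)
    return mask
```

Pixels flagged `True` are the ones for which the two views disagree, which is typically what happens to background that only one camera can see behind a foreground object.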

There are probably more examples.

If you specify your problem, then maybe we can define what occlusion means in that case and what problems it entails.

随风而去 2024-09-07 22:30:32

Occlusion is one of the main reasons why computer vision is hard in general, and it is especially problematic in object tracking. See the figures below:

[image]

Notice how the lady's face is not completely visible in frames 0519 and 0835, as opposed to the face in frame 0005.


And here's one more picture, where the man's face is partially hidden in all three frames.

[image: partial occlusion]


Notice in the image below how tracking of the couple in the red and green bounding boxes is lost in the middle frame due to occlusion (i.e. they are partially hidden by another person in front of them), but they are tracked correctly again in the last frame, when they become (almost) completely visible.

[image]

Picture courtesy: Stanford, USC

心是晴朗的。 2024-09-07 22:30:32

An occlusion is something that blocks our view. In the image shown here, we can easily see the people in the front row, but the second row is only partly visible and the third row is much less visible. Here, we say that the second row is partly occluded by the first row, and the third row is occluded by the first and second rows.
We can see such occlusions wherever there are a lot of objects: classrooms (students sitting in rows), traffic junctions (vehicles waiting at a signal), forests (trees and plants), etc.
[image]

白色秋天 2024-09-07 22:30:32

In addition to what has been said, I want to add the following:

  • For object tracking, an essential part of dealing with occlusions is writing an effective cost function that can discriminate between the occluded object and the object occluding it. If the cost function is poor, the object instances (ids) may swap and the object will be tracked incorrectly. There are numerous ways to write cost functions: some methods use CNNs[1], while others prefer to have more control and aggregate features[2]. The disadvantage of CNN models is that if you are tracking objects that are in the training set in the presence of objects that are not, and the former get occluded, the tracker can latch onto the wrong object and may never recover. Here is a video showing this. The disadvantage of aggregate features is that you have to engineer the cost function manually, which can take time and sometimes requires knowledge of advanced mathematics.
  • In dense stereo vision reconstruction, occlusion happens when a region is seen by the left camera but not by the right one (or vice versa). In the disparity map this occluded region appears black, because the pixels in that region have no counterpart in the other image. Some techniques use so-called background filling algorithms, which fill the occluded black region with pixels coming from the background. Other reconstruction methods simply leave those pixels without values in the disparity map, because the pixels produced by background filling may be incorrect in those regions. Below are the 3D projected points obtained using a dense stereo method. The points were rotated a bit to the right (in 3D space). In this scenario the occluded values in the disparity map are left unreconstructed (black), which is why we see the black "shadow" behind the person in the 3D image.

    [image]
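
Both points above can be sketched in a few lines. This is only an illustration under my own assumptions, not the method of the cited references. `association_cost` is a hypothetical hand-engineered cost that mixes bounding-box overlap with colour-histogram similarity, assuming boxes are `(x1, y1, x2, y2)` tuples, histograms are normalised to sum to 1, and `w_overlap` is an arbitrary weight. `fill_occlusions` is one simple form of background filling, assuming a disparity of 0 marks an occluded (black) pixel.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def association_cost(track_box, track_hist, det_box, det_hist, w_overlap=0.5):
    """Cost of assigning a detection to a track; lower means a better match."""
    overlap_cost = 1.0 - iou(track_box, det_box)
    # Histogram intersection: 1.0 for identical normalised histograms.
    similarity = sum(min(p, q) for p, q in zip(track_hist, det_hist))
    appearance_cost = 1.0 - similarity
    return w_overlap * overlap_cost + (1.0 - w_overlap) * appearance_cost


def fill_occlusions(disp_row, invalid=0):
    """Fill occluded disparities in one row of a disparity map.

    Each invalid pixel takes the smaller of its nearest valid neighbours to
    the left and right, i.e. the one farther from the camera (the background).
    """
    filled = list(disp_row)
    n = len(disp_row)
    for x, d in enumerate(disp_row):
        if d != invalid:
            continue
        left = next((disp_row[i] for i in range(x - 1, -1, -1)
                     if disp_row[i] != invalid), None)
        right = next((disp_row[i] for i in range(x + 1, n)
                      if disp_row[i] != invalid), None)
        candidates = [v for v in (left, right) if v is not None]
        if candidates:
            filled[x] = min(candidates)
    return filled
```

With a cost like this, a detection that overlaps the track perfectly but looks different (the occluder) can still cost more than a slightly shifted detection that looks the same, which is what keeps the ids from swapping. The filled disparities are guesses, which is why some methods prefer to leave those pixels empty.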

清旖 2024-09-07 22:30:32

Since the other answers have explained occlusion well, I will only add to them. Basically, there is a semantic gap between us and computers.

A computer actually sees every image as a sequence of values, typically in the range 0-255, for each colour channel of an RGB image. These values are indexed by (row, col) for every point in the image. So if an object changes its position with respect to the camera such that some part of the object is hidden (say, a person's hands are not shown), the computer will see different numbers (or edges, or any other features), and this changes what the algorithm has to work with to detect, recognize, or track the object.
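
A toy illustration of that point, using plain Python lists instead of a real image library. The helper names `make_image`, `occlude` and `changed_fraction` are made up for this sketch. The image is a grid of (R, G, B) tuples indexed as `image[row][col]`, and "occluding" part of it simply overwrites those numbers.

```python
def make_image(rows, cols, color):
    """Create a rows x cols image filled with one (R, G, B) colour."""
    return [[color for _ in range(cols)] for _ in range(rows)]


def occlude(image, top, left, height, width, color=(0, 0, 0)):
    """Return a copy of `image` with a rectangle painted over it."""
    out = [list(row) for row in image]
    for r in range(top, min(top + height, len(out))):
        for c in range(left, min(left + width, len(out[r]))):
            out[r][c] = color
    return out


def changed_fraction(a, b):
    """Fraction of pixels whose values differ between two same-sized images."""
    total = sum(len(row) for row in a)
    changed = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return changed / total
```

For example, covering a 2x2 corner of a 4x4 image changes a quarter of the values the algorithm sees, even though to a human it is still "the same object, partly hidden".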
