How do I map 2D coordinates in a store image to the actual shelves in the store?
We need to build a model of the shop floor in which we can relate pixel coordinates (x, y) from camera images to the actual objects in the 3D space of the store. The camera images, which will act as sources for generating such a model, suffer from fish-eye distortion. As a result, straight lines appear as curves in the camera images, and the walls do not appear to meet at exact right angles.
We are subdividing the region into polygons. Each polygon on the image corresponds to a particular region such as a shelf, display area, checkout counter, etc. By mapping the pixels that fall in each polygon, we want to identify them as belonging to the shelf corresponding to that region.
Any ideas how to go about it?
Following is a sample image of the store with some polygons marked:
EDIT:
We are not looking to find the 3D coordinates; we just need to know which shelf each polygon maps to. So if the user clicks on a polygon, we can say which shelf he clicked on.
We can manage this for big polygons like the ones shown in the image, but the shelves far from the camera can be as small as a few pixels. So we need some kind of probabilistic result: if the user clicked at (x, y), what is the probability that he was trying to click on Shelf-A, what is the probability that he was trying to click on Shelf-B, and so on.
Basically, what we are looking for is a probability function that returns the probabilities of a click on nearby objects when a small polygon (or a pixel) is clicked on the 2D image.
EDIT2:
One thing that is not apparent from the sample image is that the polygons can be really small (as small as a few pixels) and can also be really close to each other.
Moreover, the use case is that a customer in the store picks a product from one of the shelves. The application user then clicks on the point in the image from which he thinks the product was picked up. Since the polygons are so small and so close together, the user can only guess the exact point of pickup, so at best we know that it could be any one of the 3-4 polygons close to the click. So the question is: how do we calculate probabilities for these 3-4 polygons given the click?
As suggested here, the distance of the click from the center of the polygon and the polygon's area could be parameters in calculating this probability. What I am wondering is whether there is an algorithm to do so.
Comments (3)
I assume you have a mapping from polygon to shelf name, for example as a list of pairs (polygon, shelf name). You can make it by hand once, if the cameras are fixed and don't move. Then your problem is only finding which polygon a point belongs to.
If you use OpenCV, then you can use its PointPolygonTest function. Otherwise you may write a similar function yourself. See, for example, the ray casting algorithm. Then look through the list until you find a polygon that the point lies within. To further optimize the program, you may precalculate the polygons' extents (bounding boxes). An extent lets you quickly say when the point is definitely not inside the polygon, so that you only need to consider the remaining polygons. But with as few polygons as you have in the image, I would not bother.
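A minimal sketch of the lookup described above, written without OpenCV so it stays self-contained. The shelf names, the coordinates, and the helper names (`build_index`, `find_shelf`) are made up for illustration. Points lying exactly on a polygon edge may be classified either way by this simple ray casting test.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]
BBox = Tuple[float, float, float, float]


def bounding_box(poly: Polygon) -> BBox:
    """Axis-aligned extent of a polygon as (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray casting: count how many edges a ray going right from pt crosses."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_index(shelves: List[Tuple[Polygon, str]]):
    """Precalculate each polygon's extent once."""
    return [(bounding_box(poly), poly, name) for poly, name in shelves]


def find_shelf(pt: Point, index) -> Optional[str]:
    """Return the name of the first shelf whose polygon contains pt, if any."""
    x, y = pt
    for (min_x, min_y, max_x, max_y), poly, name in index:
        # Cheap rejection: outside the extent means definitely outside the polygon.
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue
        if point_in_polygon(pt, poly):
            return name
    return None


# Hypothetical hand-made mapping for one fixed camera.
SHELVES = [
    ([(0, 0), (10, 0), (10, 10), (0, 10)], "Shelf-A"),
    ([(12, 0), (20, 0), (20, 8), (12, 8)], "Shelf-B"),
]
INDEX = build_index(SHELVES)
```

Calling `find_shelf((5, 5), INDEX)` looks up the click in the hand-made list; a click that falls in no polygon gives `None`.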
Just run an experiment: ask operators to click a single highlighted pixel and accumulate some statistics on where they actually click. Once you have this, it's easy to predict the number of out-of-object clicks and how far off they are likely to be.
Without such an experiment, run with exactly the same kind of person, the same usage conditions and the same pointing device you are going to use, you cannot really tell how far off the clicks are going to be. I believe that many people are sniper clickers if the mouse is good and they can see the image well. If they are forced to use a touch interface or some other pointing device, the precision may be lower.
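As a rough sketch of what accumulating those statistics could look like, assuming each trial is recorded as a (target pixel, clicked pixel) pair. The record format, the function names, and the sample numbers are invented for illustration and are not measurements.

```python
import math
from statistics import mean


def click_error_stats(trials):
    """Summarise click offsets from a list of ((tx, ty), (cx, cy)) trials."""
    dx = [click[0] - target[0] for target, click in trials]
    dy = [click[1] - target[1] for target, click in trials]
    dists = [math.hypot(a, b) for a, b in zip(dx, dy)]
    return {
        "mean_dx": mean(dx),  # systematic horizontal bias, in pixels
        "mean_dy": mean(dy),  # systematic vertical bias, in pixels
        "rms": math.sqrt(mean([d * d for d in dists])),  # typical miss distance
        "max": max(dists),
    }


def fraction_within(trials, radius):
    """Share of clicks that landed within `radius` pixels of their target."""
    hits = sum(
        1
        for target, click in trials
        if math.hypot(click[0] - target[0], click[1] - target[1]) <= radius
    )
    return hits / len(trials)


# Made-up sample trials, only to show the call.
TRIALS = [
    ((100, 100), (101, 100)),
    ((50, 50), (50, 48)),
    ((10, 10), (10, 10)),
    ((200, 80), (197, 84)),
]
stats = click_error_stats(TRIALS)
```

The RMS distance from such a run is one plausible choice for the spread parameter of whatever click-error model is used afterwards.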
A few comments:
Fish-eye distortion can be corrected by applying some transformations to the image; see for example this page for some resources, including panotools.
To get 3D coordinates, a single image from one camera is not enough; additional information is necessary.
Marking the same point on two images of the same scene taken from different cameras can give you full 3D information (you do need to know the position of the cameras relative to each other).
If you are looking for tools to do it, see https://superuser.com/questions/30053/is-there-any-free-open-source-software-that-converts-photos-to-3d-models
EDIT
After the update to the question: assuming a set of polygons already exists and you want to eliminate user errors (or improve precision), you might
try to guess the intended polygon by calculating the distance from the click to the centre of weight (centroid) of each polygon close to the click
use visual cues (flash the selected polygon and require a second click)
collect statistics on errors and require validation for certain polygons
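One way to turn the centroid-distance idea into probabilities is sketched below. It is only an illustration under stated assumptions: the click error is modelled as an isotropic Gaussian with a spread `sigma` (in pixels) that you would have to estimate yourself, each polygon is represented only by its centroid (its area is ignored), and the function names and shelf data are hypothetical. Polygons are assumed to be simple and non-degenerate.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def centroid(poly: Polygon) -> Point:
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        cross = x1 * y2 - x2 * y1
        area2 += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    area = 0.5 * area2  # signed area; assumed non-zero
    return cx / (6.0 * area), cy / (6.0 * area)


def click_probabilities(
    click: Point, shelves: List[Tuple[Polygon, str]], sigma: float
) -> Dict[str, float]:
    """Gaussian weight per polygon from click-to-centroid distance, normalised."""
    weights = {}
    for poly, name in shelves:
        gx, gy = centroid(poly)
        d2 = (click[0] - gx) ** 2 + (click[1] - gy) ** 2
        weights[name] = math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(weights.values())
    if total == 0.0:
        # Every candidate is too far away for the weights to be representable.
        return {}
    return {name: w / total for name, w in weights.items()}


# Hypothetical nearby polygons; in practice pass only the few closest ones.
NEARBY = [
    ([(0, 0), (2, 0), (2, 2), (0, 2)], "Shelf-A"),
    ([(4, 0), (6, 0), (6, 2), (4, 2)], "Shelf-B"),
]
probs = click_probabilities((2, 1), NEARBY, sigma=2.0)
```

A click midway between two centroids gives both shelves the same probability, and the values always sum to 1 over the polygons passed in. Polygon area could be folded in by scaling each weight, but that choice is left open here.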
What you want is a space-filling curve, for example a Z-curve or a Hilbert curve. A space-filling curve subdivides the plane into smaller tiles and reduces the complexity of two dimensions to one, so that each tile gets a position in a new linear order. What might be interesting for your problem is that the Hilbert curve does not traverse the plane in plain binary order but uses a Gray code, so that each tile differs by 1 bit from the next tile along the curve. That makes it easy to decide whether the user has clicked this or that object.
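For illustration, here is a small sketch of the two building blocks mentioned: a Z-curve (Morton) key obtained by interleaving the bits of a tile's x and y indices, and the standard binary-reflected Gray code. It does not implement a full Hilbert curve, and the function names are made up.

```python
def morton_encode(x: int, y: int, bits: int = 16) -> int:
    """Z-curve key: interleave the low `bits` bits of x (even) and y (odd)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def gray_code(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def bit_difference(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")
```

Tiles that are close on the image tend to get nearby Morton keys, which is what makes a one-dimensional ordering usable for neighbour lookups, and consecutive Gray codes always differ in exactly one bit.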