How does the image recognition work in Google Shopper?
I am amazed at how well (and fast) this software works. I hovered my phone's camera over a small area of a book cover in dim light and it only took a couple of seconds for Google Shopper to identify it. It's almost magical. Does anyone know how it works?
Comments (3)
I have no idea how Google Shopper actually works. But it could work like this:
Google Shopper could also send the entire picture to Google's servers, which have considerably more powerful processors for crunching image data and could therefore apply more sophisticated preprocessing (I've chosen the steps above to be simple enough to run on a smartphone).
Anyway, the general steps are very likely to be (1) extract scale- and rotation-invariant features, and (2) match the resulting feature vector against a library of pre-computed features.
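To make step (2) concrete, here is a minimal sketch of matching one feature vector against a small library of pre-computed vectors. Everything in it is an assumption for illustration: the names `match_descriptor` and `library`, the four-dimensional toy vectors, and the 0.8 ratio threshold. It says nothing about what Google Shopper actually does. It uses plain Euclidean distance and rejects a match when the best candidate is not clearly closer than the runner-up.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptor(query, library, ratio=0.8):
    """Return the label of the closest library entry, or None if ambiguous.

    `library` maps a label to a pre-computed feature vector (hypothetical data).
    A match is accepted only if the best distance is clearly smaller than the
    second-best distance (a ratio test).
    """
    if not library:
        return None
    ranked = sorted((euclidean(query, vec), label) for label, vec in library.items())
    if len(ranked) == 1:
        return ranked[0][1]
    best, second = ranked[0], ranked[1]
    if best[0] < ratio * second[0]:
        return best[1]
    return None

# Toy library of pre-computed features (made-up numbers).
library = {
    "book_a": [0.9, 0.1, 0.0, 0.2],
    "book_b": [0.1, 0.8, 0.3, 0.0],
    "book_c": [0.0, 0.2, 0.9, 0.7],
}

print(match_descriptor([0.85, 0.15, 0.05, 0.2], library))  # close to book_a
print(match_descriptor([0.5, 0.45, 0.15, 0.1], library))   # halfway between a and b
```

The ratio test is there so that a query sitting between two library entries yields no answer rather than a wrong one, which matters when the library is large and the photo is poor.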
In any case, pattern recognition/machine learning methods are often based on:
Perform a search on a database using the features and the text in order to find the closest related product.
It is also likely that the image is cut into subimages, since the algorithm often finds a specific logo in the image.
In my opinion, the image features are sent to different pattern classifiers (algorithms that predict a "class" from an input feature vector) in order to recognize logos and, afterwards, the product itself.
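As an illustration of what "a classifier that predicts a class from a feature vector" means, here is a minimal nearest-centroid sketch. It is only one of many possible classifiers and is not claimed to be what Shopper uses. The labels `logo_x` and `logo_y`, the two-dimensional vectors, and the function names are all hypothetical.

```python
from collections import defaultdict

def train_centroids(samples):
    """Average the feature vectors of each class.

    `samples` is a list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its mean vector (the centroid).
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + x for s, x in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def classify(vec, centroids):
    """Predict the label whose centroid is closest to `vec`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(vec, centroids[label]))

# Made-up training data: two features per sample, two logo classes.
training = [
    ([1.0, 0.0], "logo_x"),
    ([0.8, 0.2], "logo_x"),
    ([0.0, 1.0], "logo_y"),
    ([0.2, 0.8], "logo_y"),
]
centroids = train_centroids(training)
print(classify([0.7, 0.1], centroids))
```

A two-stage setup like the one described above could run one such classifier to pick the logo and a second one, restricted to that brand's products, to pick the product.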
With this approach, the processing can be local, remote, or mixed. If local, all processing is carried out on the device, and only the "feature vector" and "text" are sent to the server where the database is. If remote, the whole image goes to the server. If mixed (which I think is the most probable), part of the processing is done on the device and part on the server.
Another interesting application is Google Goggles, which uses CBIR (content-based image retrieval) to search for other images related to the picture taken by the smartphone. It addresses a problem related to the one Shopper solves.
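For readers unfamiliar with CBIR, the basic idea can be sketched in a few lines: describe every image by a feature computed from its content, then rank the stored images by similarity to the query. The sketch below assumes a deliberately crude feature (a 4-bin grayscale histogram) and histogram intersection as the similarity measure. The image names and pixel values are made up, and real systems such as Goggles presumably use far richer features.

```python
def gray_histogram(pixels, bins=4):
    """Normalized histogram of 0-255 intensity values (assumes non-empty input)."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def retrieve(query_pixels, index, k=2):
    """Return the names of the k indexed images most similar to the query."""
    q = gray_histogram(query_pixels)
    scored = [(intersection(q, hist), name) for name, hist in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical image collection, each image given as a flat list of intensities.
images = {
    "dark_cover": [10, 20, 30, 40, 200, 15, 25, 35],
    "bright_cover": [250, 240, 230, 220, 30, 245, 235, 225],
    "mid_cover": [100, 120, 130, 140, 110, 150, 90, 160],
}
# The index is computed once, ahead of time, like the pre-computed library above.
index = {name: gray_histogram(px) for name, px in images.items()}

print(retrieve([12, 22, 28, 45, 200, 18, 30, 33], index))
```

An intensity histogram ignores where pixels are, so it does not change when the photo is rotated, but it also cannot tell apart two covers with similar overall brightness.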
Pattern Recognition.