How do I recognize a specific physical object from an image?
My goal is to let a person with a mobile phone snap a picture of a local landmark on our college campus (a building or something else, e.g. a gazebo or a statue), and have the app identify the landmark and tell them what it is.
For instance, they are walking around and they see a large building with a metal dome. They don't know what it is, but it looks interesting, so they snap a picture and the app tells them that it's the basketball center (and other relevant info).
My limited knowledge of this field led me to think of using neural networks and training the program to recognize particular places. If that is the right approach, please also point me to resources for it, as all I know about neural networks is that they can be used to recognize things once they are trained. :)
I know of the OpenCV library, but as I am not a C developer, I'd like to know if I need to go down that road before I start. I primarily work in Java, but I'm not opposed to getting my hands dirty.
Thanks!
Comments (3)
This is in response to your original question.
The best resource would be the O'Reilly book Learning OpenCV.
You can read it on Google Books for free. It uses C along with OpenCV, but you can use Python or Java instead if that suits your work better.
The OpenCV library includes Haar training and sample programs that show how to train it for face/text recognition. Beyond that, you'll basically have to figure things out yourself.
Another useful resource I just stumbled upon is Intel's reference manual for OpenCV.
So, good luck!
Well, using your second method is much easier, since you know where you are from the GPS coordinates and you know which way you're facing (most mobile devices have an integrated compass and accelerometer). Several augmented reality browsers already work this way; if you use Android, you might want to have a look at "Layar"...
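As a rough illustration of the GPS-plus-compass idea, here is a minimal sketch that picks the nearest known landmark lying roughly in the direction the phone is pointing. It is only one way to do it: the function names, the `landmarks` table, the 300 m range, and the 30 degree half field of view are all assumptions made for the example, and it is written in Python for brevity (the same math ports directly to Java).

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0

def angle_diff_deg(a, b):
    """Smallest absolute difference between two compass angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def landmark_in_view(lat, lon, heading_deg, landmarks,
                     max_dist_m=300.0, half_fov_deg=30.0):
    """Return the name of the nearest landmark within range and within the
    assumed field of view around the compass heading, or None.

    `landmarks` is a hypothetical dict: name -> (latitude, longitude)."""
    best_name, best_dist = None, None
    for name, (l_lat, l_lon) in landmarks.items():
        d = distance_m(lat, lon, l_lat, l_lon)
        if d > max_dist_m:
            continue  # too far away to be what the user is looking at
        offset = angle_diff_deg(bearing_deg(lat, lon, l_lat, l_lon), heading_deg)
        if offset > half_fov_deg:
            continue  # not in the direction the phone is pointing
        if best_dist is None or d < best_dist:
            best_name, best_dist = name, d
    return best_name
```

For example, with a made-up table such as `{"Basketball Center": (40.001, -83.0)}`, calling `landmark_in_view(40.0, -83.0, 0.0, table)` looks due north from the user's position and returns the landmark if it is within range.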
The more user-friendly way would be via photography, since not every phone has GPS and users always need to turn it on first...
First of all, you'd need to get the most salient structures and features of the buildings. OpenCV has some methods for that. Feature extraction is a big topic in image processing. You should probably extract the edges in your image, take the prominent features/points, and compare these to a database of the features of all the buildings you have.
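The "compare to a database" step can be sketched roughly as follows, assuming some feature extractor (not shown here) has already turned each image into a list of fixed-length numeric descriptors. Each query descriptor votes for the landmark with the closest stored descriptor, but only if that match is clearly better than the best match from any other landmark. That ratio check, the `identify` function, and its thresholds are assumptions for illustration, not something the answer prescribes.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query_descriptors, database, ratio=0.75, min_votes=2):
    """Guess which landmark a photo shows.

    query_descriptors: descriptors extracted from the new photo.
    database: list of (landmark_label, descriptor) pairs from reference
              photos; it should cover at least two different landmarks.
    Returns the winning label, or None if the evidence is too weak."""
    votes = Counter()
    for q in query_descriptors:
        # Best (smallest) distance from q to each landmark's descriptors.
        best_per_label = {}
        for label, d in database:
            dist = euclidean(q, d)
            if label not in best_per_label or dist < best_per_label[label]:
                best_per_label[label] = dist
        ranked = sorted(best_per_label.items(), key=lambda kv: kv[1])
        if len(ranked) < 2:
            continue  # nothing to compare against, so no reliable vote
        (label, d1), (_, d2) = ranked[0], ranked[1]
        # Only vote when the best landmark is clearly closer than the runner-up.
        if d1 < ratio * d2:
            votes[label] += 1
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count >= min_votes else None
```

The brute-force loop is fine for a handful of campus landmarks. A larger database would want an approximate nearest-neighbour index instead.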
You could use a neural network for training, but you'd still need a lot of reference pictures to extract data from for the learning process to work.
(For comparing against the whole database of other objects, you might even want to look at doing the calculation server-side instead of doing all this on the phone.)
Hope that helps...
Doing this as a computer vision task would be very difficult for someone with little computer vision experience; 10 years ago it was an entirely unsolved problem. But to get you started:
Neural networks (or properly, NNs with back-propagation-style training) are rather old hat, and no longer the method of choice. Random forests are popular, mostly because they are quite flexible, reasonably easy to implement, and have on average no worse performance than the other classification methods around. Criminisi et al 2011 is the standard paper. http://research.microsoft.com/pubs/155552/decisionForests_MSR_TR_2011_114.pdf
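To give a feel for why random forests count as "reasonably easy to implement", here is a toy sketch: each tree is trained on a bootstrap sample, considers a random subset of features at every split, and the forest predicts by majority vote. It is a simplified illustration and not the formulation from the cited paper. The feature vectors stand in for whatever image features you extract, and all names and defaults are assumptions. Labels are assumed to be strings.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return (feature, threshold) giving the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbours
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, n_features, depth=0, max_depth=5):
    """Grow one tree; a leaf is a label, an inner node is a 4-tuple."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    features = rng.sample(range(len(rows[0])), n_features)  # random subset
    split = best_split(rows, labels, features)
    if split is None:
        return majority
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, n_features, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, n_features, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_features))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]
```

In practice you would reach for an existing implementation rather than this one; the sketch is only meant to show how little machinery the basic idea needs.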
Last time I checked the literature (a few years ago now), there appeared to be two good first choices of image feature: SIFT or sparse Haar wavelets.
Have a look at Criminisi et al 2008 (http://research.microsoft.com/pubs/72423/Criminisi_bmvc2008.pdf) for a random forest and Haar wavelet based object recognition system.
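For intuition about Haar-style features, here is a minimal sketch of the common rectangle-difference form, computed with an integral image (summed-area table) so that any rectangle sum costs four lookups. This is the generic Haar-like feature and may differ from the exact features used in the paper above. The function names and the plain list-of-rows grayscale image format are assumptions.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.

    img is a list of rows of pixel intensities; entry [y][x] of the result
    is the sum of all pixels above and to the left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_vertical_edge(ii, x, y, w, h):
    """Two-rectangle feature: left half minus right half of the window.

    Large absolute values indicate a vertical edge inside the window."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

A classifier such as a random forest would then be trained on many such feature values sampled at different positions and scales.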
An alternative approach, from Fergus et al. 2007 (http://cs.nyu.edu/~fergus/papers/fergus_ijcv.pdf), uses a simple image patch model tied together using a Bayesian network.
OpenCV is probably as good a place as any to start looking for existing code. Matlab also claims to have good support for these tasks.