EverNote OCR feature?
I downloaded the EverNote API Xcode project, but I have a question regarding the OCR feature. With their OCR service, can I take a picture and show the extracted text in a UILabel, or does it not work like that?
Or is the extracted text never shown to me, and used only for the photo search function?
Has anyone ever had any experience with this or any ideas?
Thanks!
Comments (2)
It doesn't really work like that. Evernote doesn't really do "OCR" in the pure sense of turning document images into coherent paragraphs of text.
Evernote's recognition XML (which you can retrieve via the technique that @DaveDeLong shows in his answer) is most useful as an index to search against; the service will provide you with sets of rectangles and sets of possible words/text fragments with probability scores attached. This makes a great basis for matching search terms, but a terrible one for constructing a single string that represents the document.
(I know this answer is like 4 years late, but Dave's excellent description doesn't really address this philosophical distinction that you'll run up against if you try to actually do what you were suggesting in the question.)
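To make the distinction concrete, here is a rough Python sketch (not Evernote SDK code). The sample XML is hypothetical and only assumes the general shape of the recognition data: `item` elements carrying a rectangle (`x`, `y`, `w`, `h`) and `t` children carrying candidate text with a weight `w`. The function names `matches` and `best_guess_string` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated sample in the assumed shape of the recognition XML.
SAMPLE = """<recoIndex objWidth="400" objHeight="100">
  <item x="10" y="12" w="120" h="40">
    <t w="87">EVER</t>
    <t w="41">EVEN</t>
  </item>
  <item x="140" y="12" w="110" h="40">
    <t w="52">NOTE</t>
    <t w="49">MOTE</t>
  </item>
</recoIndex>"""


def matches(xml_text, term, min_weight=0):
    """Return the rectangles of regions where any candidate equals the term."""
    hits = []
    for item in ET.fromstring(xml_text).iter("item"):
        for t in item.iter("t"):
            same_text = (t.text or "").lower() == term.lower()
            if same_text and int(t.get("w", 0)) >= min_weight:
                hits.append(tuple(int(item.get(k)) for k in ("x", "y", "w", "h")))
                break  # one hit per region is enough
    return hits


def best_guess_string(xml_text):
    """Join the highest-weighted candidate of each region, in document order."""
    words = []
    for item in ET.fromstring(xml_text).iter("item"):
        candidates = list(item.iter("t"))
        if candidates:
            best = max(candidates, key=lambda t: int(t.get("w", 0)))
            words.append(best.text or "")
    return " ".join(words)
```

Searching works even for a low-ranked alternative, because every candidate is kept: `matches(SAMPLE, "mote")` still finds the second rectangle. `best_guess_string` has to commit to one candidate per region and has no notion of lines, paragraphs, or reading order beyond the order of the elements, which is why the result is a poor stand-in for the document's text.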
Yes, but it looks like it's going to be a bit of work.
When you get an EDAMResource that corresponds to an image, it has a property called recognition that returns an EDAMData object that contains the XML that defines the recognition info. For example, I attached an image to a note, inspected the recognition info that was attached to the corresponding EDAMResource object, and found this: the XML I found (on pastie.org, because it's too big to fit in an answer).
As you can see, there's a LOT of information here. The XML is defined in the API documentation, so this would be where you parse the XML and extract the relevant information yourself. Fortunately, the structure of the XML is quite simple (you could write a parser in a few minutes). The hard part will be to figure out what parts you want to use.
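As a rough idea of how small such a parser can be, here is a Python sketch using only the standard library. It assumes the XML consists of `item` elements with `x`, `y`, `w`, `h` attributes and `t` children whose `w` attribute is a weight; check the API documentation for the authoritative definition. `Region` and `parse_recognition` are hypothetical names, and the sample bytes stand in for whatever the EDAMData object actually holds.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Region:
    """One recognized rectangle and its candidate texts (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int
    candidates: list = field(default_factory=list)  # (text, weight) pairs


def parse_recognition(xml_bytes):
    """Parse recognition XML into a list of Region objects."""
    regions = []
    for item in ET.fromstring(xml_bytes).iter("item"):
        region = Region(*(int(item.get(k, 0)) for k in ("x", "y", "w", "h")))
        for t in item.iter("t"):
            region.candidates.append((t.text or "", int(t.get("w", 0))))
        # Most likely candidate first.
        region.candidates.sort(key=lambda c: c[1], reverse=True)
        regions.append(region)
    return regions


# Hypothetical sample standing in for the bytes of the recognition data.
SAMPLE_XML = b"""<recoIndex objWidth="400" objHeight="100">
  <item x="10" y="12" w="120" h="40">
    <t w="41">EVEN</t>
    <t w="87">EVER</t>
  </item>
</recoIndex>"""

regions = parse_recognition(SAMPLE_XML)
```

From there, deciding which parts to use is up to you: for example, keep only the first candidate of each region, or drop candidates below some weight threshold before showing or indexing anything.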