How can I highlight a word in an image?
I'd like to be able to highlight a word in an image of a document when the user searches for that word, exactly like Google Books does here.
As far as I know, Tesseract and other open-source OCR programs don't support this sort of function. Does anyone have any ideas on how it might be done?
Comments (1)
Yes, they "support" it. Sort of.
They give you a rectangle (a bounding box) that tells you where the word is. Using that, fill the rectangle on the image with the color of your choice using a color blending mode (e.g., keep the luma intact and alter only the chroma). This works well with black-and-white and grayscale images, which most books are, and is sufficient for most colored fonts too (except those on a colored background). For those cases, a solution is to invert the colors instead of highlighting them; many applications do this (Foxit Reader comes to mind).
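To make the blending idea concrete, here is a minimal sketch in plain Python. It assumes the image is already loaded as rows of `(r, g, b)` tuples and that the OCR step has given you a word box as `(left, top, right, bottom)` with the right and bottom edges exclusive. Both conventions, and the function names `highlight_box` and `invert_box`, are my own assumptions for illustration and not part of any OCR library. The chroma swap uses the common full-range BT.601 (JFIF) YCbCr conversion, which is one reasonable choice among several.

```python
def _clamp(value):
    """Round to the nearest integer and clip to the 0..255 channel range."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) RGB -> YCbCr, returned as floats."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, clipped to valid 8-bit channels."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return _clamp(r), _clamp(g), _clamp(b)


def highlight_box(pixels, box, tint=(255, 255, 0)):
    """Return a copy of `pixels` with the box tinted.

    Each pixel inside the box keeps its own luma (Y) and takes the
    chroma (Cb, Cr) of `tint`. Clipping to 0..255 means the luma is
    only approximately preserved for very bright or very dark pixels.
    `box` is (left, top, right, bottom), right/bottom exclusive
    (an assumed convention; adapt it to your OCR output).
    """
    left, top, right, bottom = box
    _, tint_cb, tint_cr = rgb_to_ycbcr(*tint)
    out = [list(row) for row in pixels]
    for y in range(top, bottom):
        for x in range(left, right):
            luma, _, _ = rgb_to_ycbcr(*pixels[y][x])
            out[y][x] = ycbcr_to_rgb(luma, tint_cb, tint_cr)
    return out


def invert_box(pixels, box):
    """Return a copy of `pixels` with the colors inside the box inverted."""
    left, top, right, bottom = box
    out = [list(row) for row in pixels]
    for y in range(top, bottom):
        for x in range(left, right):
            r, g, b = pixels[y][x]
            out[y][x] = (255 - r, 255 - g, 255 - b)
    return out


# Tiny demo: a 2x3 "page" with one black "ink" pixel on white paper.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
page = [[WHITE, BLACK, WHITE],
        [WHITE, WHITE, WHITE]]
highlighted = highlight_box(page, (0, 0, 2, 1))  # tint the first two pixels of row 0
inverted = invert_box(page, (0, 0, 2, 1))        # or invert them instead
```

With a yellow tint, white paper inside the box turns yellow while dark ink stays dark, which is the effect described above. In a real application you would read the pixels through an imaging library and take the box from the OCR engine's word-level output. Tesseract, for instance, can emit word boxes in its hOCR and TSV outputs, though hOCR gives corner coordinates while TSV gives left, top, width and height, so a conversion to this box layout may be needed.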