OCR and Kanji symbol algorithms
A friend of mine is starting a new project. He wants to use some form of OCR to detect Kanji symbols and translate them into other languages. He has hit a brick wall finding available algorithms for this, since these symbols are a bit more complex than the English characters we're used to.
We suggested that he look into 2D convolution and Fourier transforms as a first step toward pattern recognition, but he is still looking for a good starting point.
Unfortunately, my knowledge of OCR is extremely limited, so any suggestions I can pass along would be most helpful!
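To make the 2D convolution idea a little more concrete, here is a minimal sketch of template matching by cross-correlation (convolution with an unflipped kernel) in plain Python. Everything in it is an assumption made for illustration: the names `correlate2d`, `find_glyph`, `PLUS` and `make_page` are hypothetical, the "page" is a tiny 0/1 bitmap, and the glyph is a plus sign rather than a real Kanji. It only shows the mechanics; it is not a usable OCR method on its own.

```python
def correlate2d(image, kernel):
    """Return the 'valid' 2D cross-correlation of two lists of rows."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out


def find_glyph(image, template):
    """Return (row, col) of the window that correlates best with template."""
    count = len(template) * len(template[0])
    mean = sum(map(sum, template)) / count
    # Subtract the mean so that solid blocks of ink do not score highly.
    kernel = [[v - mean for v in row] for row in template]
    scores = correlate2d(image, kernel)
    positions = [(r, c) for r in range(len(scores)) for c in range(len(scores[0]))]
    return max(positions, key=lambda rc: scores[rc[0]][rc[1]])


# Hypothetical 3x3 glyph used as the template (1 = ink, 0 = background).
PLUS = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]


def make_page():
    """Build a 6x8 test bitmap with the glyph at row 2, column 4."""
    page = [[0] * 8 for _ in range(6)]
    for i in range(3):
        for j in range(3):
            page[2 + i][4 + j] = PLUS[i][j]
    page[0][0] = page[0][1] = page[1][0] = 1  # unrelated ink
    return page


print(find_glyph(make_page(), PLUS))  # (2, 4)
```

The nested loops cost time proportional to image size times template size. This is where the Fourier transform comes in: by the convolution theorem the same score map can be computed with FFTs, which is usually faster for large templates. The sketch does not attempt that, and it assumes the glyph appears at exactly the template's size and shape.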
Comments (1)
Have a look at nhocr.
(There is also Tesseract, but I'm not sure whether it actually supports CJK.)
There are quite a few questions on SO with information about OCR; for instance, try this search.