OCR library for math formulas
I need an open OCR library which is able to scan complex printed math formulas (for example some formulas which were generated via LaTeX). I want to get some LaTeX-like output (or just some AST-like data).
Is there something like this already? Or are current OCR techniques only able to parse line-oriented text?
(Note that I also posted this question on Metaoptimize because some people there might have additional knowledge.)
The problem was also described by OpenAI as im2latex.
Comments (10)
SESHAT is an open-source system written in C++ for recognizing handwritten mathematical expressions. SESHAT was developed as part of a PhD thesis at the PRHLT research center at Universitat Politècnica de València.
An online demo: http://cat.prhlt.upv.es/mer/
The source: https://github.com/falvaro/seshat
According to the answers on Metaoptimize and the discussion on the Tesseract mailing list, there doesn't seem to be an open/free solution yet which can do that.
The only solution which seems to be able to do it (but I cannot verify as it is Windows-only and non-free) is, like a few other people have mentioned, the InftyProject.
InftyReader is the only one I'm aware of. It is NOT free software (it seems the money goes to a non-profit org, IIRC).
http://www.sciaccess.net/en/InftyReader/
I don't know why PDF can't have metadata in LaTeX? As in: put the LaTeX equation in it! Is this so hard? (I dunno anything about PDF syntax, but I imagine it can be done).
LaTeX syntax is THE ONE TRIED AND TRUE STANDARD for mathematics notation. It seems amazingly stupid that the folks who produced MathML and other stuff don't take this into consideration. InftyReader generates MathML or LaTeX syntax.
If I want HTML (pure) I then use TTH to read the LaTeX syntax. Just works.
ABBYY FineReader (a great OCR program) claims you can train the software for Math, but this is immensely braindead (who has the time?)
And Unicode has lots of math symbols. That today's OCR readers can't grok them shows the sorry state of software and the brain deficit in this activity.
As to "one symbol at a time", TeX obviously has rules as to where it will place symbols. They can't write software that knows those rules?! TeX is even public domain! They can just "use it" in their commercial products.
Check out "Web Equation." It can convert handwritten equations to LaTeX, MathML, or SymbolTree. I'm not sure if the engine is open source.
Considering that current technologies read one symbol at a time (see http://detexify.kirelabs.org/classify.html), I doubt there is an OCR for full mathematical equations.
Infty works fairly well. My former company integrated it into an application that reads equations out loud for blind people and is getting good feedback from users.
http://www.inftyproject.org/en/download.html
Since the output from math OCR for complex formulas will likely have bugs (even humans have trouble with it), you will have to proofread the results, at least if they matter. The (human) proofreader will then have to correct the results, meaning you need to have a math formula editor. Given the effort needed by humans and the probably limited corpus of complex formulas, you might find it easier to assign the task to humans.
As a research problem, reading math via OCR is fun -- you need a formalism for 2-D grammars plus a symbol recognizer.
In addition to references already mentioned here, why not google for this? There is work that was done at Caltech, Rochester, U. Waterloo, and UC Berkeley. How much of it is ready to use out of the box? Dunno.
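To make the "2-D grammar plus symbol recognizer" idea a bit more concrete, here is a minimal Python sketch of only the layout step. It assumes a symbol recognizer has already produced labels and bounding boxes; the `Symbol` class, the `to_latex` function, and the 25%/75% thresholds are hypothetical choices for illustration, not taken from any of the tools mentioned in this thread. It handles just one level of superscripts and subscripts, whereas real systems need fractions, roots, nesting, and so on.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    label: str  # LaTeX for the recognized glyph, e.g. "x" or "\\alpha"
    x: float    # left edge of the bounding box
    y: float    # top edge (image coordinates: y grows downward)
    w: float    # width
    h: float    # height

    @property
    def cy(self) -> float:
        """Vertical center of the bounding box."""
        return self.y + self.h / 2


def relation(base: Symbol, s: Symbol):
    """Return "^", "_" or None (same baseline) for s relative to base."""
    if s.h < base.h and s.cy < base.y + 0.25 * base.h:
        return "^"   # smaller and centered near or above the top of base
    if s.h < base.h and s.cy > base.y + 0.75 * base.h:
        return "_"   # smaller and centered near or below the bottom of base
    return None


def token(s: Symbol) -> str:
    # A control word such as \alpha needs a trailing space before a letter.
    return s.label + " " if s.label.startswith("\\") else s.label


def to_latex(symbols) -> str:
    out, group = [], []
    base, mode = None, None  # mode is "^" or "_" while collecting a script

    def flush():
        nonlocal group, mode
        if group:
            out.append(mode + "{" + "".join(group).strip() + "}")
        group, mode = [], None

    for s in sorted(symbols, key=lambda sym: sym.x):  # read left to right
        rel = relation(base, s) if base is not None else None
        if rel is None:          # baseline symbol: close any open script
            flush()
            out.append(token(s))
            base = s
        else:                    # script symbol: extend or start a group
            if rel != mode:
                flush()
                mode = rel
            group.append(token(s))
    flush()
    return "".join(out).strip()


if __name__ == "__main__":
    boxes = [
        Symbol("x", 0, 10, 10, 10),
        Symbol("2", 11, 5, 5, 6),
        Symbol("+", 18, 12, 8, 6),
        Symbol("y", 28, 10, 10, 10),
        Symbol("i", 39, 17, 4, 5),
    ]
    print(to_latex(boxes))
```

The hard part in practice is everything this sketch leaves out: noisy boxes, ambiguous glyphs, and nested two-dimensional structure, which is where a proper 2-D grammar comes in.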
As of August 2019, there are a few options, depending on what you need:
For converting printed math equations/formulas to LaTeX, Mathpix is absolutely the best choice. It's free.
For converting handwritten math to LaTeX or printed math, MyScript is the best option, although its app costs a few dollars.
You know, there's an application in Win7 just for that: Math Input Panel. It even handles handwritten input (it's actually made for this). Give it a shot if you have Win7, it's free!
There is this great short video: http://www.youtube.com/watch?v=LAJm3J36tLQ
explaining how you can train FineReader to recognize math formulas. If you use FineReader already, it's better to stick with one tool. Of course, it is not freeware :(