How to find a blank field on a scanned document image

Posted 2024-07-14 01:42:58

I want my application to fill in a single field in a form that
exists as a black-and-white image file. The form always
starts as the same paper version, but by the time my
application gets it from my users, it may have been scanned or faxed more
than once. Because of that, the field I need is not in the
same place in every file.

My users do not always get the blank form from me, so I do not
have the ability to print a mark or placeholder that I can
recognize later.

There is text on the original blank form, but because it may
have been faxed, I have only 200 dpi of resolution. The text
is always big enough for a human to read, but I'm skeptical
about OCR.

I have some budget so I do not need a free solution ... let's
just say $2000.

That said, I am considering

  1. Get an OCR solution to find the text
    label on the field I need. I do not
    think I have the resources or
    expertise to roll my own. I do not
    need perfect recognition, since I
    already know what the text says.
    But I do need to know X- and
    Y-coordinates. Is there software
    that does this? Or is the programming easier than I think?

  2. Build or buy software to recognize
    the edges of the form. From there,
    I could get the relative position of
    the field I need. I'm thinking of
    the dashed line my scanner software puts around the image of
    a small document. Is that a known
    algorithm, or is there an available
    solution?

  3. Some other way to recognize the
    field I need. Attempts to google
    form filling software give me
    hundreds of matches for web forms,
    PDF forms, etc. that do not do what I
    need.

I'm not picky about language. My application runs on Linux, but if the best solution is Microsoft, I can probably make that work.

I'd appreciate your thoughts.

Comments (3)

风为裳 2024-07-21 01:42:58

If I understand correctly, the form is always the same, but may be shifted, scaled, or slightly rotated due to photocopying/faxing. In that case, your problem is one of image registration: find the optimal rigid transformation that makes a form from a user line up with your "model" form, in which you know the location of the field of interest. Once you know the transformation, you can compute the location of the field in the user's form.

There are many image registration algorithms, typically developed for applications such as aligning MR images of the brain. They can be computationally expensive and often require statistical priors. Fortunately, your case is easier: all you need to do is fit a rectangle around the contents of the user's form. Coordinate descent should work. You will need some tolerance for noise (junk outside the form).
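To make the rectangle-fitting idea concrete, here is a minimal sketch in Python. It is not coordinate descent: it swaps in plain row and column ink counts, which is cruder but easy to follow. The function names (`content_bounds`, `map_point`), the `min_ink` threshold, and the list-of-lists image format are all assumptions for illustration, and rotation is ignored entirely.

```python
def content_bounds(image, min_ink=2):
    """Return (left, top, right, bottom) of the content in a binary image.

    image is a list of rows, each a list of 0 (white) / 1 (black) pixels.
    Rows and columns with fewer than min_ink black pixels are treated as
    noise, a crude tolerance for specks outside the form (assumed value).
    """
    row_ink = [sum(row) for row in image]
    col_ink = [sum(col) for col in zip(*image)]
    rows = [i for i, n in enumerate(row_ink) if n >= min_ink]
    cols = [j for j, n in enumerate(col_ink) if n >= min_ink]
    if not rows or not cols:
        raise ValueError("no content found")
    return cols[0], rows[0], cols[-1], rows[-1]


def map_point(point, model_box, user_box):
    """Map (x, y) from the model form into the user's scan.

    Only shift and scale are handled; the boxes are assumed to have
    non-zero width and height, and the scan is assumed not to be rotated.
    """
    x, y = point
    ml, mt, mr, mb = model_box
    ul, ut, ur, ub = user_box
    sx = (ur - ul) / (mr - ml)  # horizontal scale factor
    sy = (ub - ut) / (mb - mt)  # vertical scale factor
    return ul + (x - ml) * sx, ut + (y - mt) * sy
```

With both boxes in hand, `map_point(field_xy, model_box, user_box)` gives an estimate of where the field landed in the user's scan. A single threshold like `min_ink` will not cope with heavy fax noise or a skewed page, so treat this as a starting point only.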

貪欢 2024-07-21 01:42:58

Here's a little summary of some available OCR solutions (open source and not): http://googlesystem.blogspot.com/2007/04/open-source-ocr-software-sponsored-by.html
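On the question of getting X- and Y-coordinates out of OCR: as one illustration (my assumption, not something taken from the linked summary), Tesseract via the `pytesseract` wrapper can return a bounding box per recognized word through `image_to_data`. Since the label text is known in advance, a fuzzy match can tolerate misread characters. The helper names `find_label` and `ocr_words` and the `cutoff` value are hypothetical.

```python
from difflib import SequenceMatcher


def find_label(words, label, cutoff=0.7):
    """Return (left, top, width, height) of the OCR word closest to label.

    words is a dict of parallel lists in the layout produced by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    Returns None if no word reaches the similarity cutoff (assumed value).
    """
    target = label.lower()
    best, best_score = None, -1.0
    for i, text in enumerate(words["text"]):
        score = SequenceMatcher(None, text.strip().lower(), target).ratio()
        if score >= cutoff and score > best_score:
            best, best_score = i, score
    if best is None:
        return None
    return (words["left"][best], words["top"][best],
            words["width"][best], words["height"][best])


def ocr_words(path):
    """Run Tesseract on an image file and return per-word boxes."""
    # Imported lazily so find_label works without Tesseract installed.
    from PIL import Image
    import pytesseract
    return pytesseract.image_to_data(
        Image.open(path), output_type=pytesseract.Output.DICT)
```

A call such as `find_label(ocr_words("scan.png"), "Signature")` (with a hypothetical file name) would then give the pixel box of the label, from which the field position can be offset. This matches single words only; a multi-word label would need the neighbouring boxes joined first.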

最佳男配角 2024-07-21 01:42:58

Rigid registration may not be enough. Users may modify the layout and formatting of a template form, for example by changing the fonts, moving a checkbox or an entry box, or breaking a paragraph at different line positions. These differences are more complicated to deal with than a pure shift, rotation, or scale transformation. Besides, if your image is a binary (black-and-white) image, I don't think those medical image registration algorithms, which work on grayscale images, will help much. Your cost function and minimization strategy may need to change accordingly.
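As a rough idea of what a cost function for binary images could look like, here is a minimal sketch that counts disagreeing pixels (an XOR-style mismatch) and searches integer shifts exhaustively. This is my own illustration under simple assumptions: same-size images as lists of 0/1 rows, translation only, and hypothetical names `mismatch` and `best_shift`. It does nothing for the layout changes described above.

```python
def mismatch(model, scan, dx, dy):
    """Count pixels where the model, shifted by (dx, dy), disagrees with scan.

    Both images are same-size lists of rows of 0/1 pixels. Model pixels
    shifted in from outside the frame are treated as white (0).
    """
    h, w = len(scan), len(scan[0])
    count = 0
    for y in range(h):
        for x in range(w):
            my, mx = y - dy, x - dx
            inside = 0 <= my < h and 0 <= mx < w
            m = model[my][mx] if inside else 0
            if m != scan[y][x]:
                count += 1
    return count


def best_shift(model, scan, max_shift=5):
    """Try every integer shift up to max_shift and keep the cheapest one."""
    candidates = [(dx, dy)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: mismatch(model, scan, *s))
```

Brute force is only practical for small search ranges; at 200 dpi on a full page it would likely need downsampling or a coarse-to-fine search first.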
