Azure Language Studio does not show the OCR text content in the Python script
I am running OCR on a Word document to recognize its content. I noticed that the Python script auto-generated in Language Studio does not show the content available in the document. I just want the Python script structure in which I can see the tags that identify the sentences, without the table content.
Is the approach I am looking for the right one? Any flow that explains the requirement is much appreciated.
Comments (1)
This falls under Form Recognizer. The Python script that gets generated does not show the general text from an image or from a document file such as DOCX or PDF. Form recognition is tied to specific structures that are covered by pre-built model scenarios. Input content that is not in tabular form does not appear in the Python script; it is visible in the JSON output, separately from the tabular content.
Check out this thread for reference:
Azure Cognitive Form Recognizer to Extract Page Numbers using Python