How can I stop text characters from being replaced with their graphic representation when printing PDF files?
I have a virtual printer driver that creates EMF files from whatever is printed. My application then analyzes the generated EMF files and extracts the text information.
Here is the problem: when a customer prints a PDF file, text information is often missing from the generated EMF file, because the PDF printing software replaces non-ASCII characters with their graphic representation. For example, instead of an EMR_EXTTEXTOUT or EMR_SMALLTEXTOUT record, the generated file contains an EMR_BEGINPATH/EMR_POLYDRAW16/EMR_ENDPATH sequence for every printed character, so I am unable to extract text information from such an EMF file.
Is it possible to disable this behavior?
Comments (1)
There is nothing you can do: this behavior is implemented in the printing software or in the PDF file itself (the PDF file might store its text as curves rather than as plain text), not in the print driver.
Perhaps the printing software has an option to switch between printing text as text and printing text as curves.
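Since the driver cannot change what the application sends, it may at least help to detect which files arrived with outlined text. Below is a minimal Python sketch, not a full EMF parser: it walks the record stream (each EMF record starts with a 4-byte type and a 4-byte size, both little-endian) and counts text records against path records. The record type values are taken from the MS-EMF specification; the helper names `iter_record_types` and `summarize` are hypothetical, and treating "paths but no text records" as a sign of outlined glyphs is only a heuristic, since paths are also used for ordinary vector graphics.

```python
import struct
from collections import Counter

# Record type values as listed in the MS-EMF specification.
EMR_EOF = 14
EMR_BEGINPATH = 59
EMR_ENDPATH = 60
EMR_EXTTEXTOUTA = 83
EMR_EXTTEXTOUTW = 84
EMR_POLYDRAW16 = 92
EMR_SMALLTEXTOUT = 108

TEXT_RECORDS = {EMR_EXTTEXTOUTA, EMR_EXTTEXTOUTW, EMR_SMALLTEXTOUT}


def iter_record_types(data: bytes):
    """Yield the type of each EMF record found in `data`, in file order."""
    offset = 0
    while offset + 8 <= len(data):
        rtype, size = struct.unpack_from("<II", data, offset)
        # A record is at least 8 bytes, 4-byte aligned, and fits in the file.
        if size < 8 or size % 4 != 0 or offset + size > len(data):
            raise ValueError(f"malformed record at offset {offset}")
        yield rtype
        if rtype == EMR_EOF:
            break
        offset += size


def summarize(data: bytes) -> dict:
    """Count text records versus path records in an EMF byte stream."""
    counts = Counter(iter_record_types(data))
    return {
        "text_records": sum(counts[t] for t in TEXT_RECORDS),
        "paths": counts[EMR_BEGINPATH],
        "polydraw16": counts[EMR_POLYDRAW16],
    }
```

A file that reports many `paths` and `polydraw16` records but zero `text_records` is a candidate for the "text as curves" case described above, and could be flagged for the user or routed to OCR instead of text extraction.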