ABCpdf 5 encoding problem (special characters)
I am using ABCpdf version 5 to render some HTML pages into PDFs.
I basically use the HttpServerUtility.Execute() method to retrieve the HTML for the PDF:
System.IO.StringWriter writer = new System.IO.StringWriter();
server.Execute(requestUrl, writer);
string pageResult = writer.ToString();
WebSupergoo.ABCpdf5.Doc pdfDoc = new WebSupergoo.ABCpdf5.Doc();
pdfDoc.AddImageHtml(pageResult);
response.Buffer = false;
response.ContentType = "application/pdf";
response.AddHeader("Content-Disposition", "attachment;filename=MyPdf_" +
FormatDate(DateTime.Now, "yyyy-MM-dd") + ".pdf");
response.BinaryWrite(pdfDoc.GetData());
Now some special characters such as umlauts (äöü) are replaced with a blank space. Interestingly, not all of them are affected. What I did figure out: the HTML page contains this tag:
`<meta http-equiv="content-type" content="text/xhtml; charset=utf-8" />`
If I strip this tag out before rendering, all special characters are rendered correctly. But this seems like an ugly hack to me.
Previously I did not use HttpServerUtility.Execute(); instead I let ABCpdf call the URL itself with pdfDoc.AddImageUrl("someUrl");. With that approach I had no such encoding problems.
What else could I try?
Comments (2)
Just came across this problem with ABCpdf 8.
In your code you retrieve HTML contents and pass the pageResult to AddImageHtml(). As the documentation states,
What is not mentioned is that the temp file is UTF-8 encoded, but the encoding is not stated in the HTML file.
The <meta> tag actually sets the required encoding, and solved my problem.
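If the missing encoding declaration is indeed the cause, one option is to add the tag in code before calling AddImageHtml(). The following is only a minimal sketch under that assumption: HtmlCharset and EnsureUtf8Meta are hypothetical names, not part of ABCpdf, and the regular expressions are a rough heuristic rather than a real HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of ABCpdf): makes sure an HTML string
// declares a charset before it is handed to the renderer.
public static class HtmlCharset
{
    // Matches any <meta ...> tag that already carries a charset declaration.
    private static readonly Regex CharsetPattern =
        new Regex(@"<meta[^>]*charset\s*=", RegexOptions.IgnoreCase);

    // Matches the opening <head> tag, with or without attributes,
    // but not other elements such as <header>.
    private static readonly Regex HeadPattern =
        new Regex(@"<head(\s[^>]*)?>", RegexOptions.IgnoreCase);

    private const string Utf8Meta =
        "<meta http-equiv=\"content-type\" content=\"text/html; charset=utf-8\" />";

    public static string EnsureUtf8Meta(string html)
    {
        // A charset is already declared: leave the markup untouched.
        if (CharsetPattern.IsMatch(html))
            return html;

        // Otherwise insert the declaration directly after the opening <head> tag.
        Match head = HeadPattern.Match(html);
        if (head.Success)
            return html.Insert(head.Index + head.Length, Utf8Meta);

        // No <head> element found: prepend the tag as a fallback.
        return Utf8Meta + html;
    }
}
```

With the code from the question, the call would become pdfDoc.AddImageHtml(HtmlCharset.EnsureUtf8Meta(pageResult));. Note that the helper leaves an existing declaration alone, so it would not change anything for a page that already declares a charset, as the page in the question does.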
One way to avoid declaring the encoding yourself is to use the AddImageUrl() method, which I expect detects the HTML encoding from the HTTP/HTML response.
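For comparison, a sketch of the AddImageUrl() route, assembled from the calls that already appear in the question. PdfFromUrl and its Write method are hypothetical names, and absoluteUrl is assumed to be a URL that the server itself can reach.

```csharp
using System;
using System.Web;
using WebSupergoo.ABCpdf5;

// Sketch only: lets ABCpdf fetch the page itself instead of passing it an HTML string.
public static class PdfFromUrl
{
    public static void Write(HttpResponse response, string absoluteUrl)
    {
        Doc pdfDoc = new Doc();

        // ABCpdf requests the page over HTTP, so it sees the response
        // headers as well as the markup.
        pdfDoc.AddImageUrl(absoluteUrl);

        response.Buffer = false;
        response.ContentType = "application/pdf";
        response.AddHeader("Content-Disposition",
            "attachment;filename=MyPdf_" + DateTime.Now.ToString("yyyy-MM-dd") + ".pdf");
        response.BinaryWrite(pdfDoc.GetData());
    }
}
```

The trade-off is that ABCpdf makes its own, separate request for the page, so state tied to the current request (which Server.Execute() shares) is presumably not available to the rendered page.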
The encoding meta tag and the AddImageUrl method may help with a simple document, but not in a chained situation, where the encoding somehow gets lost despite the encoding tag. I ran into this problem (exactly as described in the original question: some foreign characters such as umlauts disappear) and see no solution. I am considering getting rid of ABCpdf altogether and replacing it with SSRS, which can render to PDF.