如何使用 WkHTMLToSharp / wkhtmltopdf 和 C# 中的图像将 HTML 文件转换为 PDF
我正在即时生成 HTML 文件,并且我想从最终文件创建 PDF。我使用以下代码生成 HTML 文件:
public static void WriteHTML(string cFile, List<Movie> mList)
{
int lineID = 0;
string strHeader, strMovie, strGenre, tmpGenre = null;
string strPDF = null;
// initiates streamwriter for catalog output file
FileStream fs = new FileStream(cFile, FileMode.Create);
StreamWriter catalog = new StreamWriter(fs);
strHeader = "<style type=\"text/css\">\r\n" + "<!--\r\n" + "tr#odd {\r\n" + " background-color:#e2e2e2;\r\n" + " vertical-align:top;\r\n" + "}\r\n" + "\r\n" + "tr#even {\r\n" + " vertical-align:top;\r\n" + "}\r\n" + "div#title {\r\n" + " font-size:16px;\r\n" + " font-weight:bold;\r\n" + "}\r\n" + "\r\n" + "div#mpaa {\r\n" + " font-size:10px;\r\n" + "}\r\n" + "\r\n" + "div#genre {\r\n" + " font-size:12px;\r\n" + " font-style:italic;\r\n" + "}\r\n" + "\r\n" + "div#plot {\r\n" + " height: 63px;\r\n" + " font-size:12px;\r\n" + " overflow:hidden;\r\n" + "}\r\n" + "-->\r\n" + "</style>\r\n" + "\r\n" + "<html>\r\n" + " <body>\r\n" + " <table>\r\n";
catalog.WriteLine(strHeader);
strPDF = strHeader;
foreach (Movie m in mList)
{
tmpGenre = null;
strMovie = lineID == 0 ? " <tr id=\"odd\" style=\"page-break-inside:avoid\">\r\n" : " <tr id=\"even\" style=\"page-break-inside:avoid\">\r\n";
catalog.WriteLine(strMovie);
strPDF += strMovie;
foreach (string genre in m.Genres)
tmpGenre += ", <a href=\"" + genre + ".html\" target=\"_blank\">" + genre + "</a>";
strGenre = tmpGenre != null ? tmpGenre.Substring(2) : null;
strMovie = " <td>\r\n" + " <img src=\".\\images\\" + m.ImageFile + "\" width=\"75\" height=\"110\">\r\n" + " </td>\r\n" + " <td>\r\n" + " <div id=\"title\">" + m.Title + "</div>\r\n" + " <div id=\"mpaa\">" + m.Certification + " " + m.MPAA + "</div>\r\n" + " <div id=\"genre\">" + strGenre + "</div>\r\n" + " <div id=\"plot\">" + m.Plot + "</div>\r\n" + " </td>\r\n" + " </tr>\r\n";
catalog.WriteLine(strMovie);
strPDF += strMovie;
lineID = lineID == 0 ? 1 : 0;
}
string closingHTML = " </table>\r\n" + " </body>\r\n" + "</html>";
catalog.WriteLine(closingHTML);
strPDF += closingHTML;
WritePDF(strPDF, cFile + ".PDF");
catalog.Close();
}
完成后,我想调用以下函数来生成 PDF 文件:
public static void WritePDF(string cFile, string pdfFile)
{
WkHtmlToPdfConverter w = new WkHtmlToPdfConverter();
byte[] strHTML = w.Convert(cFile);
File.WriteAllBytes(pdfFile, strHTML);
w.Dispose();
}
我发现 .Convert 函数会将 HTML 代码转换为 PDF,而不是文件。其次,当我直接传递 HTML 代码时,图像不会出现在 PDF 中。我知道 .GIF 文件存在问题,但这些都是 .JPG 文件。
我读过很多关于 wkhtmltopdf 有多好的文章,并且编写 WkHTMLToSharp 的人在 SO 上发布了他的项目,但我对缺乏相关文档感到失望。
我希望能够传入要转换的文件,更改边距(我知道这是可能的,我只需要找出正确的设置),让它正确转换图像,最重要的是,不要破坏我的项目跨多个页面(支持“page-break-inside:avoid”或类似的内容)。
我很想看看其他人如何使用它!
I am generating HTML files on the fly, and I would like to create a PDF from the final file. I am using the following to generate the HTML file:
public static void WriteHTML(string cFile, List<Movie> mList)
{
int lineID = 0;
string strHeader, strMovie, strGenre, tmpGenre = null;
string strPDF = null;
// initiates streamwriter for catalog output file
FileStream fs = new FileStream(cFile, FileMode.Create);
StreamWriter catalog = new StreamWriter(fs);
strHeader = "<style type=\"text/css\">\r\n" + "<!--\r\n" + "tr#odd {\r\n" + " background-color:#e2e2e2;\r\n" + " vertical-align:top;\r\n" + "}\r\n" + "\r\n" + "tr#even {\r\n" + " vertical-align:top;\r\n" + "}\r\n" + "div#title {\r\n" + " font-size:16px;\r\n" + " font-weight:bold;\r\n" + "}\r\n" + "\r\n" + "div#mpaa {\r\n" + " font-size:10px;\r\n" + "}\r\n" + "\r\n" + "div#genre {\r\n" + " font-size:12px;\r\n" + " font-style:italic;\r\n" + "}\r\n" + "\r\n" + "div#plot {\r\n" + " height: 63px;\r\n" + " font-size:12px;\r\n" + " overflow:hidden;\r\n" + "}\r\n" + "-->\r\n" + "</style>\r\n" + "\r\n" + "<html>\r\n" + " <body>\r\n" + " <table>\r\n";
catalog.WriteLine(strHeader);
strPDF = strHeader;
foreach (Movie m in mList)
{
tmpGenre = null;
strMovie = lineID == 0 ? " <tr id=\"odd\" style=\"page-break-inside:avoid\">\r\n" : " <tr id=\"even\" style=\"page-break-inside:avoid\">\r\n";
catalog.WriteLine(strMovie);
strPDF += strMovie;
foreach (string genre in m.Genres)
tmpGenre += ", <a href=\"" + genre + ".html\" target=\"_blank\">" + genre + "</a>";
strGenre = tmpGenre != null ? tmpGenre.Substring(2) : null;
strMovie = " <td>\r\n" + " <img src=\".\\images\\" + m.ImageFile + "\" width=\"75\" height=\"110\">\r\n" + " </td>\r\n" + " <td>\r\n" + " <div id=\"title\">" + m.Title + "</div>\r\n" + " <div id=\"mpaa\">" + m.Certification + " " + m.MPAA + "</div>\r\n" + " <div id=\"genre\">" + strGenre + "</div>\r\n" + " <div id=\"plot\">" + m.Plot + "</div>\r\n" + " </td>\r\n" + " </tr>\r\n";
catalog.WriteLine(strMovie);
strPDF += strMovie;
lineID = lineID == 0 ? 1 : 0;
}
string closingHTML = " </table>\r\n" + " </body>\r\n" + "</html>";
catalog.WriteLine(closingHTML);
strPDF += closingHTML;
WritePDF(strPDF, cFile + ".PDF");
catalog.Close();
}
Once completed, I want to call the following function to generate the PDF file:
public static void WritePDF(string cFile, string pdfFile)
{
WkHtmlToPdfConverter w = new WkHtmlToPdfConverter();
byte[] strHTML = w.Convert(cFile);
File.WriteAllBytes(pdfFile, strHTML);
w.Dispose();
}
I've discovered that the .Convert function will convert HTML code to PDF, not a file. Secondly, when I pass in the HTML code directly, the images are not appearing in the PDF. I know there is an issue with .GIF files, but these are all .JPG files.
I've read a lot about how good wkhtmltopdf is, and the guy who wrote WkHTMLToSharp posted his project all over SO, but I've been disappointed by the lack of documentation for it.
I WANT to be able to pass in a file to convert, change the margins (I know this is possible, I just need to figure out the correct settings), have it convert images correctly, and most importantly, to not break up my items across multiple pages (support "page-break-inside:avoid" or something similar).
I'd love to see how others are using this!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我编写了一个关于如何从 HTML 创建 PDF 的示例。我刚刚更新它也可以打印图像。
https://github.com/hmadrigal/playground-dotnet/tree/ master/MsDotNet.PdfGeneration
(在我的博客文章中,我解释了该项目的大部分内容https://hmadrigal.wordpress.com/2015/10/16/creating-pdf-reports-from-html-using-dotliquid-markup-for-templates-and-wkhtmltoxsharp-for-printing-pdf/< /a> )
几乎有两个选择:
1:使用 file:// 和文件的完整路径。
2:使用 URL 数据 (https://en.wikipedia.org/wiki/Data_URI_scheme)
干杯,
草本植物
I have coded an example about how to create a PDF from HTML. I just updated it to also print images.
https://github.com/hmadrigal/playground-dotnet/tree/master/MsDotNet.PdfGeneration
(In my blog post I explain most of the project https://hmadrigal.wordpress.com/2015/10/16/creating-pdf-reports-from-html-using-dotliquid-markup-for-templates-and-wkhtmltoxsharp-for-printing-pdf/ )
Pretty much you have two options:
1: Using file:// and the fullpath to the file.
2: Using URL Data (https://en.wikipedia.org/wiki/Data_URI_scheme)
Cheers,
Herb
使用 WkHtmlToXSharp。
从 Github 下载最新的 DLL
Use WkHtmlToXSharp.
Download the latest DLL from Github
您可以使用 Spire.Pdf 来执行此操作。
该组件可以将 html 转换为 pdf。
You can use Spire.Pdf to do so.
This component could convert html to pdf.
我们还使用 wkhtmltopdf 并且能够正确渲染图像。但是,默认情况下禁用图像渲染。
您必须在转换器实例上指定这些选项:
We're also using wkhtmltopdf and are able to render images correctly. However, by default the rendering of images is disabled.
You have to specify those options on your converter instance: