How to convert an HTML file with images to PDF using WkHTMLToSharp / wkhtmltopdf in C#

Posted on 2024-11-25 00:00:26

I am generating HTML files on the fly, and I would like to create a PDF from the final file. I am using the following to generate the HTML file:

    public static void WriteHTML(string cFile, List<Movie> mList)
    {
        int lineID = 0;
        string strHeader, strMovie, strGenre, tmpGenre = null;

        string strPDF = null;

        // initiates streamwriter for catalog output file
        FileStream fs = new FileStream(cFile, FileMode.Create);
        StreamWriter catalog = new StreamWriter(fs);

        strHeader = "<style type=\"text/css\">\r\n" + "<!--\r\n" + "tr#odd {\r\n" + "   background-color:#e2e2e2;\r\n" + "  vertical-align:top;\r\n" + "}\r\n" + "\r\n" + "tr#even {\r\n" + "   vertical-align:top;\r\n" + "}\r\n" + "div#title {\r\n" + "  font-size:16px;\r\n" + "    font-weight:bold;\r\n" + "}\r\n" + "\r\n" + "div#mpaa {\r\n" + "    font-size:10px;\r\n" + "}\r\n" + "\r\n" + "div#genre {\r\n" + " font-size:12px;\r\n" + "    font-style:italic;\r\n" + "}\r\n" + "\r\n" + "div#plot {\r\n" + "   height: 63px;\r\n" + "  font-size:12px;\r\n" + "    overflow:hidden;\r\n" + "}\r\n" + "-->\r\n" + "</style>\r\n" + "\r\n" + "<html>\r\n" + "    <body>\r\n" + "     <table>\r\n";
        catalog.WriteLine(strHeader);
        strPDF = strHeader;

        foreach (Movie m in mList)
        {
            tmpGenre = null;

            strMovie = lineID == 0 ? "          <tr id=\"odd\" style=\"page-break-inside:avoid\">\r\n" : "          <tr id=\"even\" style=\"page-break-inside:avoid\">\r\n";
            catalog.WriteLine(strMovie);
            strPDF += strMovie;

            foreach (string genre in m.Genres)
                tmpGenre += ", <a href=\"" + genre + ".html\" target=\"_blank\">" + genre + "</a>";
            strGenre = tmpGenre != null ? tmpGenre.Substring(2) : null;

            strMovie = "                <td>\r\n" + "                   <img src=\".\\images\\" + m.ImageFile + "\" width=\"75\" height=\"110\">\r\n" + "               </td>\r\n" + "              <td>\r\n" + "                   <div id=\"title\">" + m.Title + "</div>\r\n" + "                    <div id=\"mpaa\">" + m.Certification + " " + m.MPAA + "</div>\r\n" + "                  <div id=\"genre\">" + strGenre + "</div>\r\n" + "                   <div id=\"plot\">" + m.Plot + "</div>\r\n" + "              </td>\r\n" + "          </tr>\r\n";
            catalog.WriteLine(strMovie);
            strPDF += strMovie;
            lineID = lineID == 0 ? 1 : 0;
        }

        string closingHTML = "      </table>\r\n" + "   </body>\r\n" + "</html>";
        catalog.WriteLine(closingHTML);
        strPDF += closingHTML;
        WritePDF(strPDF, cFile + ".PDF");
        catalog.Close();
    }

Once completed, I want to call the following function to generate the PDF file:

public static void WritePDF(string cFile, string pdfFile)
{
    WkHtmlToPdfConverter w = new WkHtmlToPdfConverter();

    byte[] strHTML = w.Convert(cFile);
    File.WriteAllBytes(pdfFile, strHTML);
    w.Dispose();
}

I've discovered that the .Convert function converts HTML markup passed in as a string, not a file. Secondly, when I pass in the HTML markup directly, the images do not appear in the PDF. I know there is an issue with .GIF files, but these are all .JPG files.

I've read a lot about how good wkhtmltopdf is, and the guy who wrote WkHTMLToSharp posted his project all over SO, but I've been disappointed by the lack of documentation for it.

I WANT to be able to pass in a file to convert, change the margins (I know this is possible, I just need to figure out the correct settings), have it convert images correctly, and most importantly, to not break up my items across multiple pages (support "page-break-inside:avoid" or something similar).

I'd love to see how others are using this!

Comments (4)

琉璃梦幻 2024-12-02 00:00:26

I have coded an example about how to create a PDF from HTML. I just updated it to also print images.

https://github.com/hmadrigal/playground-dotnet/tree/master/MsDotNet.PdfGeneration

(My blog post explains most of the project: https://hmadrigal.wordpress.com/2015/10/16/creating-pdf-reports-from-html-using-dotliquid-markup-for-templates-and-wkhtmltoxsharp-for-printing-pdf/ )

Pretty much you have two options:

1: Use file:// with the full path to the file.

<img alt="profile" src="{{ employee.PorfileFileName | Prepend: "Assets\ProfileImage\" | ToLocalPath  }}" />

2: Use a data URI (https://en.wikipedia.org/wiki/Data_URI_scheme)

<img alt="profile" src="data:image/png;base64,{{ employee.PorfileFileName | Prepend: "Assets\ProfileImage\" | ToLocalPath | ToBase64 }}" />
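The ToLocalPath and ToBase64 filters in the two templates above belong to the linked project. As a rough, library-free sketch of the same two ideas in plain C#, the helpers below build a file:// URI from a relative image path and a data URI from image bytes. The class and method names are hypothetical and are not part of WkHtmlToXSharp or DotLiquid.

```csharp
using System;
using System.IO;

public static class ImageSource
{
    // Option 1: absolute file:// URI for an image path. Relative paths are
    // resolved against baseDirectory (hypothetical helper, not a library call).
    public static string ToFileUri(string baseDirectory, string imagePath)
    {
        // Accept both separator styles so ".\images\a.jpg" works on any platform.
        string normalised = imagePath.Replace('\\', Path.DirectorySeparatorChar)
                                     .Replace('/', Path.DirectorySeparatorChar);
        string fullPath = Path.GetFullPath(Path.Combine(baseDirectory, normalised));
        return new Uri(fullPath).AbsoluteUri;
    }

    // Option 2: data URI built from raw image bytes. mimeType should match
    // the real image format, for example "image/jpeg" for .JPG files.
    public static string ToDataUri(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }
}
```

Either result can be written into the src attribute of an img tag. The data URI variant embeds the whole image in the HTML, so the markup grows with every image, while the file:// variant keeps the HTML small but depends on the files still being on disk at conversion time.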

Cheers,
Herb

┊风居住的梦幻卍 2024-12-02 00:00:26

Use WkHtmlToXSharp.

Download the latest DLL from GitHub.

public static string ConvertHTMLtoPDF(string htmlFullPath, string pageSize, string orientation)
{
   // Write the PDF next to the source file, changing only the extension.
   string pdfUrl = System.IO.Path.ChangeExtension(htmlFullPath, ".pdf");

   //IHtmlToPdfConverter converter = new WkHtmlToPdfConverter();
   IHtmlToPdfConverter converter = new MultiplexingConverter();

   try
   {
       // Page layout: no margins, caller-supplied orientation and page size.
       converter.GlobalSettings.Margin.Top = "0cm";
       converter.GlobalSettings.Margin.Bottom = "0cm";
       converter.GlobalSettings.Margin.Left = "0cm";
       converter.GlobalSettings.Margin.Right = "0cm";
       converter.GlobalSettings.Orientation = (PdfOrientation)Enum.Parse(typeof(PdfOrientation), orientation);
       if (!string.IsNullOrEmpty(pageSize))
           converter.GlobalSettings.Size.PageSize = (PdfPageSize)Enum.Parse(typeof(PdfPageSize), pageSize);

       // Load the HTML from a file path rather than from a string.
       converter.ObjectSettings.Page = htmlFullPath;
       converter.ObjectSettings.Web.EnablePlugins = true;
       converter.ObjectSettings.Web.EnableJavascript = true;
       converter.ObjectSettings.Web.Background = true;
       converter.ObjectSettings.Web.LoadImages = true;
       converter.ObjectSettings.Load.LoadErrorHandling = LoadErrorHandlingType.ignore;

       // With Page set, Convert() is called without arguments.
       Byte[] bufferPDF = converter.Convert();

       System.IO.File.WriteAllBytes(pdfUrl, bufferPDF);
   }
   finally
   {
       // Release the converter even if the conversion throws.
       converter.Dispose();
   }

   return pdfUrl;
}
落花随流水 2024-12-02 00:00:26

You can use Spire.Pdf to do so.

This component can convert HTML to PDF.

 PdfDocument pdfdoc = new PdfDocument();
 pdfdoc.LoadFromHTML(fileFullName, true, true, true);
 //String url = "http://www.e-iceblue.com/";
 //pdfdoc.LoadFromHTML(url, false, true, true);
 pdfdoc.SaveToFile("FromHTML.pdf");
祁梦 2024-12-02 00:00:26

We're also using wkhtmltopdf and are able to render images correctly. However, by default the rendering of images is disabled.

You have to specify those options on your converter instance:

var wk = _GetConverter();
wk.GlobalSettings.Margin.Top = "20mm";
wk.GlobalSettings.Margin.Bottom = "10mm";
wk.GlobalSettings.Margin.Left = "10mm";
wk.GlobalSettings.Margin.Right = "10mm";
wk.GlobalSettings.Size.PaperSize = PdfPaperSize.A4;
wk.ObjectSettings.Web.PrintMediaType = true;
wk.ObjectSettings.Web.LoadImages = true;
wk.ObjectSettings.Web.EnablePlugins = false;
wk.ObjectSettings.Web.EnableJavascript = true;

result = wk.Convert(htmlContent);
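
When the HTML is handed over as a string, as in Convert(htmlContent) above, relative image paths such as the .\images\ ones in the question may have no base directory to resolve against. One possible workaround, sketched below with only the .NET base class library, is to rewrite relative img src values to absolute file:// URIs before converting. The class and method names are hypothetical, and the regular expression only handles double-quoted src attributes, so treat it as a starting point rather than a full HTML parser.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class HtmlImagePaths
{
    // Captures the value of a double-quoted src attribute inside an <img> tag.
    private static readonly Regex ImgSrc = new Regex(
        "(<img\\b[^>]*?\\bsrc\\s*=\\s*\")([^\"]*)(\")",
        RegexOptions.IgnoreCase);

    // A scheme of two or more characters followed by ':' (http:, file:, data:).
    // Single letters are excluded so Windows drive paths like C:\ are not skipped.
    private static readonly Regex HasScheme = new Regex("^[a-zA-Z][a-zA-Z0-9+.-]+:");

    public static string MakeImageSourcesAbsolute(string html, string baseDirectory)
    {
        return ImgSrc.Replace(html, match =>
        {
            string src = match.Groups[2].Value;
            if (src.Length == 0 || HasScheme.IsMatch(src))
                return match.Value; // empty or already a URL: leave unchanged

            // Accept both separator styles, then resolve against baseDirectory.
            string normalised = src.Replace('\\', Path.DirectorySeparatorChar)
                                   .Replace('/', Path.DirectorySeparatorChar);
            string fullPath = Path.GetFullPath(Path.Combine(baseDirectory, normalised));

            return match.Groups[1].Value + new Uri(fullPath).AbsoluteUri + match.Groups[3].Value;
        });
    }
}
```

A typical call would read the generated file with File.ReadAllText, pass the text and the file's directory to MakeImageSourcesAbsolute, and give the returned string to Convert.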