调用wkhtmltopdf从HTML生成PDF

发布于 2024-08-02 10:42:21 字数 583 浏览 7 评论 0原文

我正在尝试从 HTML 文件创建 PDF 文件。环顾四周后,我发现: wkhtmltopdf 非常完美。我需要从 ASP.NET 服务器调用此 .exe。我尝试过:

    Process p = new Process();
    p.StartInfo.UseShellExecute = false;
    p.StartInfo.FileName = HttpContext.Current.Server.MapPath("wkhtmltopdf.exe");
    p.StartInfo.Arguments = "TestPDF.htm TestPDF.pdf";
    p.Start();
    p.WaitForExit();

在服务器上创建任何文件都没有成功。谁能给我指出正确的方向?我将 wkhtmltopdf.exe 文件放在站点的顶级目录中。还有其他地方应该举行吗?


编辑:如果有人有更好的解决方案从 html 动态创建 pdf 文件,请告诉我。

I'm attempting to create a PDF file from an HTML file. After looking around a little I've found: wkhtmltopdf to be perfect. I need to call this .exe from the ASP.NET server. I've attempted:

    Process p = new Process();
    p.StartInfo.UseShellExecute = false;
    p.StartInfo.FileName = HttpContext.Current.Server.MapPath("wkhtmltopdf.exe");
    p.StartInfo.Arguments = "TestPDF.htm TestPDF.pdf";
    p.Start();
    p.WaitForExit();

With no success of any files being created on the server. Can anyone give me a pointer in the right direction? I put the wkhtmltopdf.exe file at the top level directory of the site. Is there anywhere else it should be held?


Edit: If anyone has better solutions to dynamically create pdf files from html, please let me know.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(11

〗斷ホ乔殘χμё〖 2024-08-09 10:42:22

如果正确且正确地创建了 pdf 文件,通常会返回代码 =0。如果未创建,则该值在 -ve 范围内。

Generally return code =0 is coming if the pdf file is created properly and correctly.If it's not created then the value is in -ve range.

っ〆星空下的拥抱 2024-08-09 10:42:22
using System;
using System.Diagnostics;
using System.Web;

public partial class pdftest : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {

    }
    private void fn_test()
    {
        try
        {
            string url = HttpContext.Current.Request.Url.AbsoluteUri;
            Response.Write(url);
            ProcessStartInfo startInfo = new ProcessStartInfo();
            startInfo.FileName = 
                @"C:\PROGRA~1\WKHTML~1\wkhtmltopdf.exe";//"wkhtmltopdf.exe";
            startInfo.Arguments = url + @" C:\test"
                 + Guid.NewGuid().ToString() + ".pdf";
            Process.Start(startInfo);
        }
        catch (Exception ex)
        {
            string xx = ex.Message.ToString();
            Response.Write("<br>" + xx);
        }
    }
    protected void btn_test_Click(object sender, EventArgs e)
    {
        fn_test();
    }
}
using System;
using System.Diagnostics;
using System.Web;

public partial class pdftest : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {

    }
    private void fn_test()
    {
        try
        {
            string url = HttpContext.Current.Request.Url.AbsoluteUri;
            Response.Write(url);
            ProcessStartInfo startInfo = new ProcessStartInfo();
            startInfo.FileName = 
                @"C:\PROGRA~1\WKHTML~1\wkhtmltopdf.exe";//"wkhtmltopdf.exe";
            startInfo.Arguments = url + @" C:\test"
                 + Guid.NewGuid().ToString() + ".pdf";
            Process.Start(startInfo);
        }
        catch (Exception ex)
        {
            string xx = ex.Message.ToString();
            Response.Write("<br>" + xx);
        }
    }
    protected void btn_test_Click(object sender, EventArgs e)
    {
        fn_test();
    }
}
霓裳挽歌倾城醉 2024-08-09 10:42:21

更新:
我的回答如下,在磁盘上创建 pdf 文件。然后,我将该文件作为下载流式传输到用户浏览器。考虑使用像下面 Hath 的答案这样的东西来让 wkhtml2pdf 输出到流,然后直接发送给用户 - 这将绕过文件权限等方面的许多问题。

我原来的答案:
确保您已指定 PDF 的输出路径,该路径可由服务器上运行的 IIS 的 ASP.NET 进程写入(我认为通常是 NETWORK_SERVICE)。

我的看起来像这样(并且它有效):

/// <summary>
/// Convert Html page at a given URL to a PDF file using open-source tool wkhtml2pdf
/// </summary>
/// <param name="Url"></param>
/// <param name="outputFilename"></param>
/// <returns></returns>
public static bool HtmlToPdf(string Url, string outputFilename)
{
    // assemble destination PDF file name
    string filename = ConfigurationManager.AppSettings["ExportFilePath"] + "\\" + outputFilename + ".pdf";

    // get proj no for header
    Project project = new Project(int.Parse(outputFilename));

    var p = new System.Diagnostics.Process();
    p.StartInfo.FileName = ConfigurationManager.AppSettings["HtmlToPdfExePath"];

    string switches = "--print-media-type ";
    switches += "--margin-top 4mm --margin-bottom 4mm --margin-right 0mm --margin-left 0mm ";
    switches += "--page-size A4 ";
    switches += "--no-background ";
    switches += "--redirect-delay 100";

    p.StartInfo.Arguments = switches + " " + Url + " " + filename;

    p.StartInfo.UseShellExecute = false; // needs to be false in order to redirect output
    p.StartInfo.RedirectStandardOutput = true;
    p.StartInfo.RedirectStandardError = true;
    p.StartInfo.RedirectStandardInput = true; // redirect all 3, as it should be all 3 or none
    p.StartInfo.WorkingDirectory = StripFilenameFromFullPath(p.StartInfo.FileName);

    p.Start();

    // read the output here...
    string output = p.StandardOutput.ReadToEnd(); 

    // ...then wait n milliseconds for exit (as after exit, it can't read the output)
    p.WaitForExit(60000); 

    // read the exit code, close process
    int returnCode = p.ExitCode;
    p.Close(); 

    // if 0 or 2, it worked (not sure about other values, I want a better way to confirm this)
    return (returnCode == 0 || returnCode == 2);
}

Update:
My answer below, creates the pdf file on the disk. I then streamed that file to the users browser as a download. Consider using something like Hath's answer below to get wkhtml2pdf to output to a stream instead and then send that directly to the user - that will bypass lots of issues with file permissions etc.

My original answer:
Make sure you've specified an output path for the PDF that is writeable by the ASP.NET process of IIS running on your server (usually NETWORK_SERVICE I think).

Mine looks like this (and it works):

/// <summary>
/// Convert Html page at a given URL to a PDF file using open-source tool wkhtml2pdf
/// </summary>
/// <param name="Url"></param>
/// <param name="outputFilename"></param>
/// <returns></returns>
public static bool HtmlToPdf(string Url, string outputFilename)
{
    // assemble destination PDF file name
    string filename = ConfigurationManager.AppSettings["ExportFilePath"] + "\\" + outputFilename + ".pdf";

    // get proj no for header
    Project project = new Project(int.Parse(outputFilename));

    var p = new System.Diagnostics.Process();
    p.StartInfo.FileName = ConfigurationManager.AppSettings["HtmlToPdfExePath"];

    string switches = "--print-media-type ";
    switches += "--margin-top 4mm --margin-bottom 4mm --margin-right 0mm --margin-left 0mm ";
    switches += "--page-size A4 ";
    switches += "--no-background ";
    switches += "--redirect-delay 100";

    p.StartInfo.Arguments = switches + " " + Url + " " + filename;

    p.StartInfo.UseShellExecute = false; // needs to be false in order to redirect output
    p.StartInfo.RedirectStandardOutput = true;
    p.StartInfo.RedirectStandardError = true;
    p.StartInfo.RedirectStandardInput = true; // redirect all 3, as it should be all 3 or none
    p.StartInfo.WorkingDirectory = StripFilenameFromFullPath(p.StartInfo.FileName);

    p.Start();

    // read the output here...
    string output = p.StandardOutput.ReadToEnd(); 

    // ...then wait n milliseconds for exit (as after exit, it can't read the output)
    p.WaitForExit(60000); 

    // read the exit code, close process
    int returnCode = p.ExitCode;
    p.Close(); 

    // if 0 or 2, it worked (not sure about other values, I want a better way to confirm this)
    return (returnCode == 0 || returnCode == 2);
}
第几種人 2024-08-09 10:42:21

当我尝试将 msmq 与 Windows 服务一起使用时,我遇到了同样的问题,但由于某种原因它非常慢。 (过程部分)。

这就是最终奏效的方法:

private void DoDownload()
{
    var url = Request.Url.GetLeftPart(UriPartial.Authority) + "/CPCDownload.aspx?IsPDF=False?UserID=" + this.CurrentUser.UserID.ToString();
    var file = WKHtmlToPdf(url);
    if (file != null)
    {
        Response.ContentType = "Application/pdf";
        Response.BinaryWrite(file);
        Response.End();
    }
}

public byte[] WKHtmlToPdf(string url)
{
    var fileName = " - ";
    var wkhtmlDir = "C:\\Program Files\\wkhtmltopdf\\";
    var wkhtml = "C:\\Program Files\\wkhtmltopdf\\wkhtmltopdf.exe";
    var p = new Process();

    p.StartInfo.CreateNoWindow = true;
    p.StartInfo.RedirectStandardOutput = true;
    p.StartInfo.RedirectStandardError = true;
    p.StartInfo.RedirectStandardInput = true;
    p.StartInfo.UseShellExecute = false;
    p.StartInfo.FileName = wkhtml;
    p.StartInfo.WorkingDirectory = wkhtmlDir;

    string switches = "";
    switches += "--print-media-type ";
    switches += "--margin-top 10mm --margin-bottom 10mm --margin-right 10mm --margin-left 10mm ";
    switches += "--page-size Letter ";
    p.StartInfo.Arguments = switches + " " + url + " " + fileName;
    p.Start();

    //read output
    byte[] buffer = new byte[32768];
    byte[] file;
    using(var ms = new MemoryStream())
    {
        while(true)
        {
            int read =  p.StandardOutput.BaseStream.Read(buffer, 0,buffer.Length);

            if(read <=0)
            {
                break;
            }
            ms.Write(buffer, 0, read);
        }
        file = ms.ToArray();
    }

    // wait or exit
    p.WaitForExit(60000);

    // read the exit code, close process
    int returnCode = p.ExitCode;
    p.Close();

    return returnCode == 0 ? file : null;
}

感谢格雷厄姆·安布罗斯和其他所有人。

I had the same problem when i tried using msmq with a windows service but it was very slow for some reason. (the process part).

This is what finally worked:

private void DoDownload()
{
    var url = Request.Url.GetLeftPart(UriPartial.Authority) + "/CPCDownload.aspx?IsPDF=False?UserID=" + this.CurrentUser.UserID.ToString();
    var file = WKHtmlToPdf(url);
    if (file != null)
    {
        Response.ContentType = "Application/pdf";
        Response.BinaryWrite(file);
        Response.End();
    }
}

public byte[] WKHtmlToPdf(string url)
{
    var fileName = " - ";
    var wkhtmlDir = "C:\\Program Files\\wkhtmltopdf\\";
    var wkhtml = "C:\\Program Files\\wkhtmltopdf\\wkhtmltopdf.exe";
    var p = new Process();

    p.StartInfo.CreateNoWindow = true;
    p.StartInfo.RedirectStandardOutput = true;
    p.StartInfo.RedirectStandardError = true;
    p.StartInfo.RedirectStandardInput = true;
    p.StartInfo.UseShellExecute = false;
    p.StartInfo.FileName = wkhtml;
    p.StartInfo.WorkingDirectory = wkhtmlDir;

    string switches = "";
    switches += "--print-media-type ";
    switches += "--margin-top 10mm --margin-bottom 10mm --margin-right 10mm --margin-left 10mm ";
    switches += "--page-size Letter ";
    p.StartInfo.Arguments = switches + " " + url + " " + fileName;
    p.Start();

    //read output
    byte[] buffer = new byte[32768];
    byte[] file;
    using(var ms = new MemoryStream())
    {
        while(true)
        {
            int read =  p.StandardOutput.BaseStream.Read(buffer, 0,buffer.Length);

            if(read <=0)
            {
                break;
            }
            ms.Write(buffer, 0, read);
        }
        file = ms.ToArray();
    }

    // wait or exit
    p.WaitForExit(60000);

    // read the exit code, close process
    int returnCode = p.ExitCode;
    p.Close();

    return returnCode == 0 ? file : null;
}

Thanks Graham Ambrose and everyone else.

你的呼吸 2024-08-09 10:42:21

好的,这是一个老问题,但却是一个很好的问题。由于我没有找到好的答案,所以我自己做了一个:) 此外,我已将这个超级简单的项目发布到 GitHub。

以下是一些示例代码:

var pdfData = HtmlToXConverter.ConvertToPdf("<h1>SOO COOL!</h1>");

以下是一些关键点:

  • 无 P/Invoke
  • 无需创建新进程
  • 无文件系统(全部位于 RAM 中)
  • 具有智能感知功能的本机 .NET DLL 等
  • 能够生成 PDF 或 PNG(<代码>HtmlToXConverter.ConvertToPng)

OK, so this is an old question, but an excellent one. And since I did not find a good answer, I made my own :) Also, I've posted this super simple project to GitHub.

Here is some sample code:

var pdfData = HtmlToXConverter.ConvertToPdf("<h1>SOO COOL!</h1>");

Here are some key points:

  • No P/Invoke
  • No creating of a new process
  • No file-system (all in RAM)
  • Native .NET DLL with intellisense, etc.
  • Ability to generate a PDF or PNG (HtmlToXConverter.ConvertToPng)
ぃ弥猫深巷。 2024-08-09 10:42:21

查看 wkhtmltopdf 库的 C# 包装器库(使用 P/Invoke): https://github.com/pruiz/ WkHtmlToXSharp

Check out the C# wrapper library (using P/Invoke) for the wkhtmltopdf library: https://github.com/pruiz/WkHtmlToXSharp

用心笑 2024-08-09 10:42:21

有很多原因可以解释为什么这通常是一个坏主意。如果发生崩溃,您将如何控制生成但最终保留在内存中的可执行文件?拒绝服务攻击或者恶意内容进入 TestPDF.htm 会怎样?

我的理解是ASP.NET用户帐户将没有本地登录的权限。它还需要具有正确的文件权限才能访问可执行文件并写入文件系统。您需要编辑本地安全策略并让 ASP.NET 用户帐户(可能是 ASPNET)在本地登录(默认情况下可能位于拒绝列表中)。然后您需要编辑 NTFS 文件系统上其他文件的权限。如果您处于共享托管环境中,可能无法应用您所需的配置。

使用此类外部可执行文件的最佳方法是将 ASP.NET 代码中的作业排队,并让某种服务监视队列。如果你这样做,你就能保护自己免受各种不好的事情发生。在我看来,更改用户帐户的维护问题不值得付出努力,虽然设置服务或计划作业很痛苦,但这只是一个更好的设计。 ASP.NET 页面应轮询结果队列以获取输出,并且您可以向用户显示等待页面。这在大多数情况下是可以接受的。

There are many reason why this is generally a bad idea. How are you going to control the executables that get spawned off but end up living on in memory if there is a crash? What about denial-of-service attacks, or if something malicious gets into TestPDF.htm?

My understanding is that the ASP.NET user account will not have the rights to logon locally. It also needs to have the correct file permissions to access the executable and to write to the file system. You need to edit the local security policy and let the ASP.NET user account (maybe ASPNET) logon locally (it may be in the deny list by default). Then you need to edit the permissions on the NTFS filesystem for the other files. If you are in a shared hosting environment it may be impossible to apply the configuration you need.

The best way to use an external executable like this is to queue jobs from the ASP.NET code and have some sort of service monitor the queue. If you do this you will protect yourself from all sorts of bad things happening. The maintenance issues with changing the user account are not worth the effort in my opinion, and whilst setting up a service or scheduled job is a pain, its just a better design. The ASP.NET page should poll a result queue for the output and you can present the user with a wait page. This is acceptable in most cases.

痴者 2024-08-09 10:42:21

您可以通过指定“-”作为输出文件来告诉 wkhtmltopdf 将其输出发送到 sout。
然后,您可以将进程的输出读入响应流,并避免写入文件系统时出现权限问题。

You can tell wkhtmltopdf to send it's output to sout by specifying "-" as the output file.
You can then read the output from the process into the response stream and avoid the permissions issues with writing to the file system.

属性 2024-08-09 10:42:21

我对 2018 年事情的看法。

我正在使用异步。我正在流式传输 wkhtmltopdf 和从 wkhtmltopdf 流式传输。我创建了一个新的 StreamWriter,因为 wkhtmltopdf 默认情况下需要 utf-8,但在进程启动时它被设置为其他内容。

我没有包含很多参数,因为这些参数因用户而异。您可以使用additionalArgs 添加您需要的内容。

我删除了 p.WaitForExit(...),因为我没有处理它是否失败,并且它无论如何都会挂在 await tStandardOutput 上。如果需要超时,那么您必须使用取消令牌或超时对不同任务调用 Wait(...) 并进行相应处理。

public async Task<byte[]> GeneratePdf(string html, string additionalArgs)
{
    ProcessStartInfo psi = new ProcessStartInfo
    {
        FileName = @"C:\Program Files\wkhtmltopdf\wkhtmltopdf.exe",
        UseShellExecute = false,
        CreateNoWindow = true,
        RedirectStandardInput = true,
        RedirectStandardOutput = true,
        RedirectStandardError = true,
        Arguments = "-q -n " + additionalArgs + " - -";
    };

    using (var p = Process.Start(psi))
    using (var pdfSream = new MemoryStream())
    using (var utf8Writer = new StreamWriter(p.StandardInput.BaseStream, 
                                             Encoding.UTF8))
    {
        await utf8Writer.WriteAsync(html);
        utf8Writer.Close();
        var tStdOut = p.StandardOutput.BaseStream.CopyToAsync(pdfSream);
        var tStdError = p.StandardError.ReadToEndAsync();

        await tStandardOutput;
        string errors = await tStandardError;

        if (!string.IsNullOrEmpty(errors)) { /* deal/log with errors */ }

        return pdfSream.ToArray();
    }
}

我没有包含在其中的内容,但如果您有图像、CSS 或 wkhtmltopdf 在渲染 html 页面时必须加载的其他内容,这些内容可能会很有用:

  • 使用 --cookie 来传递身份验证 cookie
  • 您可以在 html 页面的标头中 ,您可以使用指向服务器的 href 设置基本标签,如果需要,wkhtmltopdf 将使用它

My take on this with 2018 stuff.

I am using async. I am streaming to and from wkhtmltopdf. I created a new StreamWriter because wkhtmltopdf is expecting utf-8 by default but it is set to something else when the process starts.

I didn't include a lot of arguments since those varies from user to user. You can add what you need using additionalArgs.

I removed p.WaitForExit(...) since I wasn't handling if it fails and it would hang anyway on await tStandardOutput. If timeout is needed, then you would have to call Wait(...) on the different tasks with a cancellationtoken or timeout and handle accordingly.

public async Task<byte[]> GeneratePdf(string html, string additionalArgs)
{
    ProcessStartInfo psi = new ProcessStartInfo
    {
        FileName = @"C:\Program Files\wkhtmltopdf\wkhtmltopdf.exe",
        UseShellExecute = false,
        CreateNoWindow = true,
        RedirectStandardInput = true,
        RedirectStandardOutput = true,
        RedirectStandardError = true,
        Arguments = "-q -n " + additionalArgs + " - -";
    };

    using (var p = Process.Start(psi))
    using (var pdfSream = new MemoryStream())
    using (var utf8Writer = new StreamWriter(p.StandardInput.BaseStream, 
                                             Encoding.UTF8))
    {
        await utf8Writer.WriteAsync(html);
        utf8Writer.Close();
        var tStdOut = p.StandardOutput.BaseStream.CopyToAsync(pdfSream);
        var tStdError = p.StandardError.ReadToEndAsync();

        await tStandardOutput;
        string errors = await tStandardError;

        if (!string.IsNullOrEmpty(errors)) { /* deal/log with errors */ }

        return pdfSream.ToArray();
    }
}

Things I haven't included in there but could be useful if you have images, css or other stuff that wkhtmltopdf will have to load when rendering the html page:

  • you can pass the authentication cookie using --cookie
  • in the header of the html page, you can set the base tag with href pointing to the server and wkhtmltopdf will use that if need be
蝶舞 2024-08-09 10:42:21

感谢您的问题/回答/以上所有评论。我在为 WKHTMLtoPDF 编写自己的 C# 包装器时遇到了这个问题,它解决了我遇到的几个问题。我最终在一篇博客文章中写了这一点 - 其中还包含我的包装器(毫无疑问,您会看到上面的条目中的“灵感”渗透到我的代码中......

) johnnyreilly.com/2012/04/making-pdfs-from-html-in-c-using.html" rel="nofollow noreferrer">使用 WKHTMLtoPDF 在 C# 中从 HTML 制作 PDF

再次感谢大家!

Thanks for the question / answer / all the comments above. I came upon this when I was writing my own C# wrapper for WKHTMLtoPDF and it answered a couple of the problems I had. I ended up writing about this in a blog post - which also contains my wrapper (you'll no doubt see the "inspiration" from the entries above seeping into my code...)

Making PDFs from HTML in C# using WKHTMLtoPDF

Thanks again guys!

烟若柳尘 2024-08-09 10:42:21

ASP .Net 进程可能没有对该目录的写访问权限。

尝试告诉它写入 %TEMP%,看看它是否有效。

另外,让 ASP .Net 页面回显进程的 stdout 和 stderr,并检查错误消息。

The ASP .Net process probably doesn't have write access to the directory.

Try telling it to write to %TEMP%, and see if it works.

Also, make your ASP .Net page echo the process's stdout and stderr, and check for error messages.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文