Merging/filling a PDF form file with XML data

Posted on 2024-08-28 20:29:35

Let's say I have a PDF form file available on a website, which users fill in and submit to the server. On the server side (ASP.NET), I would like to merge the data that I receive in XML format with the empty PDF form that was filled in, and save the result.

As far as I can tell, there are several possible ways of doing it:

  1. Use a PDF form created with Adobe Acrobat and fill it with iTextSharp.
  2. Use a PDF form created with Adobe Acrobat and fill it with FDF Toolkit .net (which seems to use iTextSharp internally).
  3. Use pdftk to fill the form.
  4. Use a PDF form file created with Adobe LiveCycle and merge the data by using the Form Data Integration Service.

As I have no experience with this kind of task, can you advise which option would be better/easier and give some additional tips?

Thank you in advance.

Comments (3)

三寸金莲 2024-09-04 20:29:35

I would suggest using the 4th approach if possible because it would be cleaner. You would be using solutions specifically tailored for what you are asking to do, but if you don't have the available resources for such a solution I would suggest using the 1st option.

The 1st option is the one I have recently dived into. I have found it relatively painless to implement.

Option 1 is possible if the following applies:

  1. You have control over the development of the PDF forms.
  2. You have control over the formatting of the XML data.
  3. You can live with having uncompressed (fastweb=false) PDF files.

Example of implementation:

  1. Use Adobe Acrobat to generate a PDF form. Tip: use Adobe native fonts when generating the forms. For each control that uses a non-native font, the font is imported into the file and bloats it when it is not compressed, and to my knowledge iTextSharp currently does not produce compressed PDFs.

  2. Use the iTextSharp library to combine the XML data with the PDF form to generate a populated document. Tip: to manually populate a PDF form from XML, you must map the XML values to the control names in the PDF form and match them by page, as shown in the example below.

    // Requires: using System.IO; using System.Xml;
    //           using iTextSharp.text; using iTextSharp.text.pdf;

    // Caller, e.g. in an ASP.NET page:
    using (MemoryStream stream = GeneratePDF(m_FormsPath, oXmlData))
    {
        // MemoryStream.ToArray() also works on a stream that has been closed
        byte[] bytes = stream.ToArray();
        Response.ContentType = "application/pdf";
        Response.BinaryWrite(bytes);
        Response.End();
    }

    /// <summary>
    /// This method combines pdf forms with xml data
    /// </summary>
    /// <param name="m_FormName">pdf form file path</param>
    /// <param name="oData">xml dataset</param>
    /// <returns>memory stream containing the pdf data</returns>
    private MemoryStream GeneratePDF(string m_FormName, XmlDocument oData)
    {
        MemoryStream msOutput = new MemoryStream();

        // Open the template once up front, only to read the page count
        PdfReader pdfTemplate = new PdfReader(m_FormName);
        int pageCount = pdfTemplate.NumberOfPages;
        pdfTemplate.Close();

        Document doc = new Document();
        // Declared as PdfCopy (not PdfWriter) so AddPage needs no cast
        PdfCopy pCopy = new PdfCopy(doc, msOutput);

        pCopy.AddViewerPreference(PdfName.PICKTRAYBYPDFSIZE, new PdfBoolean(true));
        pCopy.AddViewerPreference(PdfName.PRINTSCALING, PdfName.NONE);

        doc.Open();

        for (int i = 1; i <= pageCount; i++)
        {
            // Fill a fresh copy of the template for each page
            MemoryStream msTemp = new MemoryStream();
            pdfTemplate = new PdfReader(m_FormName);
            PdfStamper stamper = new PdfStamper(pdfTemplate, msTemp);

            // map xml values to pdf form controls (element name = control name)
            foreach (XmlElement oElem in oData.SelectNodes("/form/page" + i + "/*"))
            {
                stamper.AcroFields.SetField(oElem.Name, oElem.InnerText);
            }

            // Flatten so the filled values become regular page content
            stamper.FormFlattening = true;
            stamper.Close();

            // Copy only page i of the filled result into the output document
            PdfReader tempPDF = new PdfReader(msTemp.ToArray());
            pCopy.AddPage(pCopy.GetImportedPage(tempPDF, i));
            pCopy.FreeReader(tempPDF);
        }

        doc.Close();

        return msOutput;
    }
    
  3. Save the file, or write it to the response of your ASP.NET page.
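
The per-page mapping in step 2 relies on the XML having one container element per page whose children are named after the form controls. As a rough sketch of that assumption, using only `System.Xml`: the `FormXml` class and the sample field names (`FirstName`, `Comments`) are hypothetical and not part of the original answer; only the `/form/pageN/*` layout comes from the XPath used above.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class FormXml
{
    // Assumed XML layout (element name = PDF control name):
    //
    //   <form>
    //     <page1><FirstName>Ada</FirstName></page1>
    //     <page2><Comments>None</Comments></page2>
    //   </form>
    //
    // Returns control name -> value for the given 1-based page number.
    public static Dictionary<string, string> FieldsForPage(XmlDocument data, int page)
    {
        var fields = new Dictionary<string, string>();

        // Same XPath as in the iTextSharp example: all child elements of <pageN>.
        XmlNodeList nodes = data.SelectNodes("/form/page" + page + "/*");
        foreach (XmlNode node in nodes)
        {
            // If a name occurs twice on a page, the last value wins.
            fields[node.Name] = node.InnerText;
        }
        return fields;
    }
}
```

Inspecting the dictionary for each page before calling `SetField` is a cheap way to spot XML element names that do not match any control name in the PDF form.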

玉环 2024-09-04 20:29:35

Since you tagged this 'LiveCycle', I take it you have an installation of Adobe LiveCycle running somewhere (or can install one somewhere).

In that case, I'd go for number 4 (with the modification of using the Adobe LiveCycle Forms ES module). The other three will undoubtedly yield compatibility issues in the long run. With the LiveCycle server (running the Forms module), you'll be able to handle any PDF, whether it's old, new, static, dynamic, compressed, Acrobat-based or LiveCycle-based.

You should be able to set things up, have the form send its data to the LiveCycle server, and use that data to populate the form. The fill can then be stored in the server's database, or routed into the PDF form (or any other form) and streamed back to the client.

Create the form using LiveCycle Designer.

The quick-and-dirty option would be the following: set the form to submit via HTTP POST (for example as XFDF; see Acrobat for more info) to your ASP server, and publish the form on the server. Make sure your users don't download the form before opening it, otherwise this won't work; the form has to be opened in the web browser. Then simply capture the submissions as you would capture an HTTP POST from a web page. Optionally, save the fill to a database. Then send the captured XFDF stream back to the client (this could also be invoked at a later stage via an HTTP link). The XFDF stream will contain the URL of the form used to fill it out. The client web browser will ask the Acrobat/Adobe Reader plug-in to handle the XFDF stream, and the plug-in will locate, download and populate the form pointed to by the XFDF.

The user should now be able to save the form and its fill, with no Reader Extension needed!
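
For readers who have not seen an XFDF stream, here is a minimal sketch of what the server might send back in the quick-and-dirty option. It assumes flat (non-hierarchical) field names and uses only `System.Xml.Linq`; the `XfdfBuilder` class and its parameters are hypothetical, and in the scenario described above the captured stream could simply be replayed rather than rebuilt.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class XfdfBuilder
{
    // XFDF namespace as used in Acrobat's exported XFDF files.
    private static readonly XNamespace Xfdf = "http://ns.adobe.com/xfdf/";

    // Builds an XFDF document that points at formUrl and carries the field values.
    public static string Build(string formUrl, IDictionary<string, string> fields)
    {
        var fieldsElem = new XElement(Xfdf + "fields");
        foreach (KeyValuePair<string, string> pair in fields)
        {
            // One <field name="..."><value>...</value></field> per control.
            fieldsElem.Add(new XElement(Xfdf + "field",
                new XAttribute("name", pair.Key),
                new XElement(Xfdf + "value", pair.Value)));
        }

        var root = new XElement(Xfdf + "xfdf",
            new XAttribute(XNamespace.Xml + "space", "preserve"),
            // <f href="..."/> is the URL of the form that the data belongs to.
            new XElement(Xfdf + "f", new XAttribute("href", formUrl)),
            fieldsElem);

        // XElement escapes special characters; the XML declaration is omitted.
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

In an ASP.NET page the result would typically be written to the response with the content type `application/vnd.adobe.xfdf`, so that the browser hands it to the Acrobat/Reader plug-in.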

书信已泛黄 2024-09-04 20:29:35

You can also use iTextSharp to fill XML data into a Reader Extensions-enabled form. There are two things you need to set correctly:

  1. Set PdfReader.unethicalreading = true to prevent a BadPasswordException.
  2. Set append mode in PdfStamper's constructor; otherwise the Adobe Reader Extensions signature is broken and Adobe Reader will display the following message: "This document contained certain rights to enable special features in Adobe Reader. The document has been changed since it was created and these rights are no longer valid. Please contact the author for the original version of this document."

So all you need to do is this:

PdfReader.unethicalreading = true;
using (var pdfReader = new PdfReader("form.pdf"))
{
    using (var outputStream = new FileStream("filled.pdf", FileMode.Create, FileAccess.Write))
    {
        using (var stamper = new iTextSharp.text.pdf.PdfStamper(pdfReader, outputStream, '\0', true))
        {
            stamper.AcroFields.Xfa.FillXfaForm("data.xml");
        }
    }
}

See How to fill XFA form using iText?
