Removing hyperlinks from a PDF document (iTextSharp)

Posted on 2024-11-04 10:56:02

I'm trying to leverage iTextSharp (very new to the product) to remove hyperlinks from a PDF document. Does anyone know if this is possible? I've been digging through the API and haven't found an obvious way to do this.

My problem is that I'm doing maintenance on a system that has PDFs embedded in an iframe. The links within the PDF cause users to end up browsing the site inside the iframe rather than in a new window or tab, so I'm looking for a way to kill the links in the PDF at request time.

Thanks in advance,
Scott

Comments (2)

Hello爱情风 2024-11-11 10:56:02

The links people click on are annotations in a given page's /Annots array.

You have two options:

  1. Destroy the entire /Annots array
  2. Search through the /Annots array and remove all the link annotations

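Simply blasting the annotation array is easy (shown here with the Java iText API; the iTextSharp port uses the same calls with .NET-style names such as GetPageN and Remove):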

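 // reader is a PdfReader on the source file; stamper is a PdfStamper wrapping it
 PdfDictionary pageDict = reader.getPageN(1); // page numbers start at 1
 pageDict.remove(PdfName.ANNOTS);             // drops every annotation on the page

 stamper.close(); // writes the modified document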

The problem is that you might be destroying annotations that you want to keep along with those you don't.

The solution is to walk the /Annots array and drop only the link annotations, keeping everything else.

PdfDictionary pageDict = reader.getPageN(1);
PdfArray annots = pageDict.getAsArray(PdfName.ANNOTS);
PdfArray newAnnots = new PdfArray();
if (annots != null) {
  for (int i = 0; i < annots.size(); ++i) {
    PdfDictionary annotDict = annots.getAsDict(i);
    // Keep everything that is not a link annotation
    // (including entries that do not resolve to a dictionary).
    if (annotDict == null || !PdfName.LINK.equals(annotDict.getAsName(PdfName.SUBTYPE))) {
      // Annots are usually stored as indirect references. Re-add the original
      // entry (the reference), not the resolved dictionary.
      newAnnots.add(annots.getPdfObject(i));
    }
  }
  pageDict.put(PdfName.ANNOTS, newAnnots);
}

This will remove all link annotations, not just those that link to internal sites. If you need to dig deeper, you'll need to check out the PDF Spec, section 12.5.6.5 (link annotations) and section 12.6.4.7 (URI actions).
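
If you only want to drop the links that point at one site, the decision comes down to three keys from the spec sections mentioned above: the annotation's /Subtype must be /Link, its /A action dictionary must have /S set to /URI, and the /URI string must point at the host you care about. Here is a minimal, library-free sketch of that check. It deliberately does not use iText: plain maps stand in for PdfDictionary, and the class name, method names and the host intranet.example are all made up for illustration. Treat it as a model of the logic, not as the library's API.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: plain Maps stand in for PDF dictionaries.
class LinkFilter {

    /** True if the annotation is a link whose URI action targets the given host. */
    static boolean isUriLinkToHost(Map<String, Object> annot, String host) {
        if (!"Link".equals(annot.get("Subtype"))) {
            return false; // not a link annotation at all
        }
        Object action = annot.get("A");
        if (!(action instanceof Map)) {
            return false; // link without an action dictionary (e.g. one using /Dest)
        }
        Map<?, ?> actionDict = (Map<?, ?>) action;
        if (!"URI".equals(actionDict.get("S"))) {
            return false; // some other action type, such as GoTo
        }
        Object uri = actionDict.get("URI");
        if (!(uri instanceof String)) {
            return false;
        }
        try {
            // getHost() may return null; equalsIgnoreCase(null) is simply false.
            return host.equalsIgnoreCase(URI.create((String) uri).getHost());
        } catch (IllegalArgumentException e) {
            return false; // malformed URI: leave the annotation alone
        }
    }

    /** Returns the annotations to keep, dropping only URI links to the given host. */
    static List<Map<String, Object>> removeLinksToHost(
            List<Map<String, Object>> annots, String host) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> annot : annots) {
            if (!isUriLinkToHost(annot, host)) {
                kept.add(annot);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> annots = new ArrayList<>();
        annots.add(Map.<String, Object>of("Subtype", "Link",
                "A", Map.<String, Object>of("S", "URI", "URI", "http://intranet.example/home")));
        annots.add(Map.<String, Object>of("Subtype", "Text"));
        System.out.println(removeLinksToHost(annots, "intranet.example"));
    }
}
```

In the real loop the same test would replace the plain PdfName.LINK comparison, reading /A, /S and /URI from the annotation dictionary instead of from a map.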

溇涏 2024-11-11 10:56:02

With PDFsharp you can do it this way:

  using PdfSharp.Pdf;
  using PdfSharp.Pdf.IO;

  void RemoveHyperlinks(string sourcePDF, string targetPDF) {
      using (PdfDocument PDFDoc = PdfReader.Open(sourcePDF, PdfDocumentOpenMode.Import)) {
          using (PdfDocument PDFNewDoc = new PdfDocument()) {
              // Copy pages to the new document, dropping their annotations
              for (int Pg = 0; Pg < PDFDoc.Pages.Count; Pg++) {
                  PdfPage page = PDFDoc.Pages[Pg];
                  // Clears all annotations on the page, links included
                  page.Annotations.Clear();
                  PDFNewDoc.AddPage(page);
              }

              PDFNewDoc.Save(targetPDF);
          }
      }
  }