How do I walk the Pdf object tree in PDFSharp?
I am trying to walk through the tree of PdfItem objects in an existing PDF document using PDFSharp in C#.
I want to create a hierarchy of all the objects as I go along -- similar to what the "PDF Explorer" example does -- but I want it to be a tree instead of a flat list of all the objects.
The root node is document.Internals.Catalog, and I want to walk down through all the document.Internals.Catalog.Elements until I have visited every element.
One of the problems I run into is that there are circular references in the tree and I can't figure out how to detect them.
Any code samples out there?
Comments (2)
This post by marihanzo on the PDFSharp forums has worked for us:
http://forum.pdfsharp.net/viewtopic.php?f=2&t=527&p=1603
The only issue we've had was handling fields with \r\n in them. Here is a copy of the code in case the forum post gets lost.
PDFParser.cs
and the calling code:
Read and analyze the entirety of the collection, and build an in-memory tree of your own. Then walk that tree.
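As a rough illustration of that idea, here is a minimal sketch of building such a tree while guarding against circular references. It is not the code from the forum post. The names PdfTreeNode, PdfTreeBuilder and IsBackReference are made up for this example, and it assumes the PDFsharp 1.x object model, where indirect objects show up in Elements as PdfReference items. Cycles (for example a page's /Parent entry pointing back to the page tree) can only arise through indirect references, so remembering the PdfObjectID of each reference already followed is enough to stop the recursion.

```csharp
using System.Collections.Generic;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Advanced;

// Hypothetical node type for the in-memory tree (not part of PDFsharp).
public sealed class PdfTreeNode
{
    public string Label;            // dictionary key, array index, or "Catalog"
    public PdfItem Item;            // the item as stored (may be a PdfReference)
    public bool IsBackReference;    // true if the target was already expanded elsewhere
    public List<PdfTreeNode> Children = new List<PdfTreeNode>();
}

// Hypothetical helper (not part of PDFsharp).
public static class PdfTreeBuilder
{
    public static PdfTreeNode Build(PdfDocument document)
    {
        // IDs of indirect objects whose references have already been followed.
        var visited = new HashSet<PdfObjectID>();
        return Visit("Catalog", document.Internals.Catalog, visited);
    }

    private static PdfTreeNode Visit(string label, PdfItem item, HashSet<PdfObjectID> visited)
    {
        var node = new PdfTreeNode { Label = label, Item = item };

        // Follow an indirect reference only the first time its ID is seen.
        var reference = item as PdfReference;
        if (reference != null)
        {
            if (!visited.Add(reference.ObjectID))
            {
                node.IsBackReference = true;   // seen before: do not descend again
                return node;
            }
            item = reference.Value;            // the referenced object
        }

        var dictionary = item as PdfDictionary;
        var array = item as PdfArray;
        if (dictionary != null)
        {
            foreach (KeyValuePair<string, PdfItem> pair in dictionary.Elements)
                node.Children.Add(Visit(pair.Key, pair.Value, visited));
        }
        else if (array != null)
        {
            for (int i = 0; i < array.Elements.Count; i++)
                node.Children.Add(Visit("[" + i + "]", array.Elements[i], visited));
        }
        // Anything else (names, numbers, strings, ...) stays a leaf.
        return node;
    }
}
```

Calling PdfTreeBuilder.Build on a document opened with PdfReader.Open(path, PdfDocumentOpenMode.ReadOnly) should return the root node. Each indirect object is expanded at most once, at the first place its reference is followed, and later references to it are marked as back-references. The recursion is kept simple here; for documents with very long reference chains an explicit stack may be safer.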