PDF name tree Limits

Posted on 2025-01-01 21:25:51

I have a PDF file that I am trying to parse with PDF Renderer, and I have run into the following issues:

(1) Some of the name trees define Limits in which either the lower or the upper bound is NULL. The specification doesn't really say how to deal with that:

(Intermediate and leaf nodes only; required) An array of two strings, specifying
the (lexically) least and greatest keys included in the Names array of a leaf 
node or in the Names arrays of any leaf nodes that are descendants of an
intermediate node. 

So I am basically assuming an open range if either bound is null. If both limits are null, I try to find the key in the Names array. Is this assumption correct?
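The lenient reading described in (1) could be sketched roughly as follows. This is Python for brevity, not PDF Renderer's Java code; `within_limits` is a hypothetical helper, and treating a null bound as "open" is the assumption being asked about, not something the specification states.

```python
from typing import Optional, Sequence


def within_limits(key: bytes, limits: Optional[Sequence[Optional[bytes]]]) -> bool:
    """Return True if `key` may be stored under a node with these Limits.

    Assumption (not from the specification): a missing Limits array or a
    null bound is treated as an open end of the range.
    """
    if not limits or len(limits) != 2:
        return True  # no usable Limits: the node cannot be ruled out
    low, high = limits
    if low is not None and key < low:
        return False
    if high is not None and key > high:
        return False
    return True
```

With both bounds null the check always passes, so the caller falls through to searching the node itself, which matches the behavior described above.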

(2) In the same PDF file, under the assumption from (1), a key I am looking for can fall within the range defined by a node's Limits and still not be present in that node, so I have to go on and look at the following kids. I guess this is still correct?
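A lookup that keeps scanning later kids after a miss, as described in (2), might look like the sketch below. It is Python rather than the library's Java, nodes are modeled as plain dictionaries with `Limits`, `Names` and `Kids` entries, and `in_range` and `lookup` are hypothetical names introduced only for illustration.

```python
from typing import Any, Optional


def in_range(key: bytes, limits) -> bool:
    # Assumption: a missing Limits array or a null bound means "open".
    if not limits or len(limits) != 2:
        return True
    low, high = limits
    return (low is None or key >= low) and (high is None or key <= high)


def lookup(node: dict, key: bytes) -> Optional[Any]:
    """Search a name tree node, tolerating overlapping or open Limits."""
    names = node.get("Names")
    if names is not None:
        # Leaf: Names is a flat [key1, value1, key2, value2, ...] array.
        for i in range(0, len(names) - 1, 2):
            if names[i] == key:
                return names[i + 1]
        return None
    for kid in node.get("Kids", []):
        if not in_range(key, kid.get("Limits")):
            continue
        found = lookup(kid, key)
        if found is not None:
            return found
        # Miss inside a kid whose Limits matched: keep trying later kids.
    return None
```

For example, with a first kid whose Limits are `[None, b"m"]` and a second whose Limits are `[b"c", None]`, a key `b"d"` matches both ranges and is only found by continuing past the first kid. Note that this sketch cannot distinguish a stored null value from a missing key.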

(3) Finally, and still in the same PDF file, there are Names that don't follow the

key1 value1 key2 value2 ... keyn valuen

sequence defined in the specification but start with a value:

value0 key1 value1 ... keyn valuen

and end with a value. So in this case I just skip the first value, at the risk of getting the mapping wrong. Again, is this correct?
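The workaround described in (3) amounts to something like the following sketch. It is Python for brevity, `pair_names` is a hypothetical helper, and dropping the leading element of an odd-length array is a guess that can mis-pair keys and values if the stray element is actually somewhere else.

```python
def pair_names(names: list) -> dict:
    """Pair up a flat Names array into a key -> value mapping.

    Assumption: an odd-length array starts with a stray value, so the
    first element is dropped before pairing.
    """
    items = list(names)
    if len(items) % 2 == 1:
        items = items[1:]  # skip the unpaired leading value
    return {items[i]: items[i + 1] for i in range(0, len(items), 2)}


# A malformed array that starts with a value instead of a key.
mapping = pair_names(["value0", b"key1", "value1", b"key2", "value2"])
```

A stricter variant could check that every element in a key position really is a string before trusting the result.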

My guess is that:

  • either the PDF is not well formed
  • or it uses some 1.6 functionality that totally confuses the library and leads to the symptoms listed above

I would like to make changes to the library to handle the PDF file in question, without breaking the existing code.

Update: to correct this issue, I finally decided not to deal with any of the above and to address the issue somewhere else instead. The problem originally arose when reading an action in the outline. Now the presumably "faulty" action is simply ignored. This is the corresponding patch.
