How to find the origin of a PDF page for an existing page...?

Posted 2025-02-01 16:46:17

Hi, I am trying to find the origin, i.e. the x and y coordinates, of a page. Are there any code examples using PDFBox, and also some theory, that would help to find the origin of a page in a PDF?

By that I mean: we need to find out whether the origin is at the
bottom left, bottom right, top right, top left, or in the middle of the page.

Comments (1)

赠佳期 2025-02-08 16:46:17

First of all, I assume we are talking about user space coordinates, not device space coordinates. When a PDF is rendered, coordinates are eventually transformed to the device space of the rendering target. But device space coordinates are device dependent and therefore not really appropriate for generic PDF processing tasks.

The default user space coordinate system of a page

The default user space coordinate system is in particular used for positioning annotations and is the initial user space coordinate system when starting to process the instructions of the page content stream.

This coordinate system is specified by the effective crop box of the page (which defaults to its media box):

The user space coordinate system shall be initialised to a default state for each page of a document. The CropBox entry in the page dictionary shall specify the rectangle of user space corresponding to the visible area of the intended output medium (display window or printed page). The positive x axis extends horizontally to the right and the positive y axis vertically upward, as in standard mathematical practice (subject to alteration by the Rotate entry in the page dictionary).

(ISO 32000-2, section 8.3.2.3 "User space")

Thus, even without considering the page rotation, the origin may be anywhere inside, on the edge, or outside the visible page area, e.g. for the following CropBox values:

  • [ 0 0 612 792 ] - origin in the lower left
  • [ 0 -792 612 0 ] - origin in the upper left
  • [ -306 -396 306 396 ] - origin in the center of the page
  • [ -1612 1000 -1000 1792 ] - origin off page to the right and below
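The four cases above can be checked mechanically. The following is only a minimal sketch in plain Java; the class and method names (CropBoxOrigin, along, describe) are made up for illustration. With PDFBox I would expect the four numbers to come from PDPage.getCropBox() (via getLowerLeftX(), getLowerLeftY(), getUpperRightX(), getUpperRightY() of the returned PDRectangle), but here they are passed in directly so the example has no dependencies. Page rotation is deliberately ignored at this point.

```java
// Minimal sketch (hypothetical helper, not part of PDFBox): classify where the
// user space origin (0, 0) lies relative to a crop box, ignoring page rotation.
final class CropBoxOrigin {

    // Position of coordinate 0 relative to the interval between a and b on one axis.
    static String along(double a, double b, String lowSide, String highSide) {
        double lo = Math.min(a, b); // rectangle corners may be given in either order
        double hi = Math.max(a, b);
        if (0 < lo) return "beyond the " + lowSide + " edge";
        if (0 > hi) return "beyond the " + highSide + " edge";
        if (0 == lo) return "on the " + lowSide + " edge";
        if (0 == hi) return "on the " + highSide + " edge";
        if (lo + hi == 0) return "centered";
        return "inside";
    }

    // Describes the origin for a crop box [llx lly urx ury].
    static String describe(double llx, double lly, double urx, double ury) {
        return "x: " + along(llx, urx, "left", "right")
             + "; y: " + along(lly, ury, "bottom", "top");
    }

    public static void main(String[] args) {
        System.out.println(describe(0, 0, 612, 792));
        System.out.println(describe(0, -792, 612, 0));
        System.out.println(describe(-306, -396, 306, 396));
        System.out.println(describe(-1612, 1000, -1000, 1792));
    }
}
```

An origin that is on the left edge and on the bottom edge is the "lower left" case from the list, and so on for the other corners.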

If you also take page rotation into account, the origin rotates with the page:

Key | Type | Value
Rotate | integer | (Optional; inheritable) The number of degrees by which the page shall be rotated clockwise when displayed or printed. The value shall be a multiple of 90. Default value: 0.

(ISO 32000-2, Table 31 "Entries in a page object")

So, for example, for the crop box [ 0 0 612 792 ] and the following Rotate values:

  • 0 - origin in the lower left
  • 90 - origin in the upper left
  • 180 - origin in the upper right
  • 270 - origin in the lower right

and for the crop box [ -1612 1000 -1000 1792 ]:

  • 0 - origin off page to the right and below
  • 90 - origin off page to the left and below
  • 180 - origin off page to the left and above
  • 270 - origin off page to the right and above
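To reproduce the two lists above, one can compute where the origin ends up on the page as displayed. This is a rough sketch under my own conventions, not something from the specification or from PDFBox: the result is expressed in a "display" system whose (0, 0) is the lower left corner of the displayed page, with x to the right and y upwards, and the names (RotatedOrigin, originOnDisplayedPage) are hypothetical. In PDFBox the rotation value would presumably come from PDPage.getRotation().

```java
// Minimal sketch (hypothetical helper): position of the user space origin on the
// page as displayed, measured from the lower left corner of the displayed page.
final class RotatedOrigin {

    static double[] originOnDisplayedPage(double llx, double lly,
                                          double urx, double ury, int rotate) {
        double w = urx - llx;   // width of the unrotated crop box
        double h = ury - lly;   // height of the unrotated crop box
        double px = 0 - llx;    // origin relative to the unrotated lower left corner
        double py = 0 - lly;
        switch (((rotate % 360) + 360) % 360) {
            case 0:   return new double[] {px, py};
            case 90:  return new double[] {py, w - px};     // displayed page is h wide, w high
            case 180: return new double[] {w - px, h - py};
            case 270: return new double[] {h - py, px};     // displayed page is h wide, w high
            default:
                throw new IllegalArgumentException("Rotate must be a multiple of 90");
        }
    }

    public static void main(String[] args) {
        for (int rotate = 0; rotate < 360; rotate += 90) {
            double[] p = originOnDisplayedPage(0, 0, 612, 792, rotate);
            System.out.println(rotate + ": " + p[0] + ", " + p[1]);
        }
    }
}
```

For the crop box [ 0 0 612 792 ] and Rotate 90, the displayed page is 792 wide and 612 high, so a result of (0, 612) is its upper left corner. Negative values, or values larger than the displayed width or height, mean the origin is off the page on that side.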

Of course, the directions of the coordinate axes also change to match the rotation:

  • 0 - x coordinates increase to the right, y coordinates upwards
  • 90 - x coordinates increase downwards, y coordinates to the right
  • 180 - x coordinates increase to the left, y coordinates downwards
  • 270 - x coordinates increase upwards, y coordinates to the left
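The axis directions in this list follow from turning the unit vectors of the x and y axes clockwise in quarter turns. A small sketch in plain Java, with hypothetical names (AxisDirections, rotateClockwise, name) and directions expressed as seen on the displayed page:

```java
// Minimal sketch (hypothetical helper): direction of a user space axis on the
// displayed page after applying the page's Rotate value.
final class AxisDirections {

    // Rotates the direction (vx, vy) clockwise by "rotate" degrees (a multiple of 90).
    static int[] rotateClockwise(int vx, int vy, int rotate) {
        if (rotate % 90 != 0) {
            throw new IllegalArgumentException("Rotate must be a multiple of 90");
        }
        int turns = (((rotate % 360) + 360) % 360) / 90;
        for (int i = 0; i < turns; i++) {
            int t = vx;   // one clockwise quarter turn: (vx, vy) -> (vy, -vx)
            vx = vy;
            vy = -t;
        }
        return new int[] {vx, vy};
    }

    // Names an axis-aligned unit direction.
    static String name(int[] v) {
        if (v[0] == 1) return "right";
        if (v[0] == -1) return "left";
        return v[1] == 1 ? "up" : "down";
    }

    public static void main(String[] args) {
        for (int rotate = 0; rotate < 360; rotate += 90) {
            System.out.println(rotate + ": x increases " + name(rotateClockwise(1, 0, rotate))
                    + ", y increases " + name(rotateClockwise(0, 1, rotate)));
        }
    }
}
```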

The current user space coordinate system of a page

While the instructions of a page content stream are processed, the user space may be transformed further, in particular by the cm instruction:

Operands | Operator | Description
a b c d e f | cm | Modify the current transformation matrix (CTM) by concatenating the specified matrix (see 8.3.2, "Coordinate spaces"). Although the operands specify a matrix, they shall be written as six separate numbers, not as an array.

(ISO 32000-2, Table 56 "Graphics state operators")

One use case for this is to have the current coordinate system "the right side up" after rotation.

For example, for the crop box [ 0 0 612 792 ] and the page rotation 90, the coordinate system has its origin in the upper left, x coordinates increase downwards, and y coordinates increase to the right. To straighten this out, you'll often find a cm instruction like this at the start of the page content stream:

0 1 -1 0 612 0 cm

After this instruction, the origin on the rotated page in our example is again in the lower left, x coordinates increase to the right, and y coordinates increase upwards.
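The effect of this particular cm instruction can be verified with a few lines of arithmetic. The sketch below is plain Java with hypothetical names (CmExample, apply, display90). It assumes the crop box [ 0 0 612 792 ] and Rotate 90 from the example, maps points from the coordinate system after the cm instruction back to the default user space, and then to the displayed page (measured from its lower left corner, x to the right, y upwards).

```java
// Minimal sketch (hypothetical helper): check what "0 1 -1 0 612 0 cm" does on a
// page with crop box [0 0 612 792] and Rotate 90.
final class CmExample {

    // Maps (x, y) from the system established by "a b c d e f cm" to the system
    // that was current before the instruction: x' = a*x + c*y + e, y' = b*x + d*y + f.
    static double[] apply(double[] m, double x, double y) {
        return new double[] {
            m[0] * x + m[2] * y + m[4],
            m[1] * x + m[3] * y + m[5]
        };
    }

    // Maps a default user space point to the displayed page for Rotate 90 and a
    // crop box [0 0 w h]; the displayed page is h wide and w high.
    static double[] display90(double w, double x, double y) {
        return new double[] {y, w - x};
    }

    public static void main(String[] args) {
        double[] cm = {0, 1, -1, 0, 612, 0};
        double[][] points = {{0, 0}, {1, 0}, {0, 1}};
        for (double[] p : points) {
            double[] user = apply(cm, p[0], p[1]);
            double[] shown = display90(612, user[0], user[1]);
            System.out.println("(" + p[0] + ", " + p[1] + ") is displayed at ("
                    + shown[0] + ", " + shown[1] + ")");
        }
    }
}
```

The new origin lands on the lower left corner of the displayed page, one unit along x moves one unit to the right, and one unit along y moves one unit up, which matches the description above.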
