Resolution of a PDFPage?

Posted 2024-09-14 02:54:54

I have a PDF document built from NSImages whose size is given in points (72 per inch); each image has a single representation whose dimensions are measured in pixels. I put these images into PDFPages with initWithImage and then save the document.

When I open the document, I need the resolution of the original image. However, all of the rectangles that PDFPage gives me are measured in points, not pixels.

I know that the information is in there, and I suppose I can try to parse the PDF data myself, by going through the voyeur.app example... but that's a WHOLE lot of effort to do something that should be pretty normal...

Is there an easier way to do this?

Added:

I've tried two techniques:

  1. Get the PDFRepresentation data from the page and use it to make a new NSImage via initWithData. This works, but the resulting image reports both its size and its pixel dimensions at 72 dpi.

  2. Draw the PDFPage into a new off-screen context and then get a CGImage from that. The problem is that to create the context I apparently need to know the size in pixels already, which defeats part of the purpose...

Comments (1)

能怎样 2024-09-21 02:54:54

There are a few things you need to understand about PDF:

  • The PDF coordinate system is in points (1/72 inch) by default.

  • The PDF coordinate system is devoid of resolution. (This is a white lie: the resolution is effectively the limit of 32-bit floating point numbers.)

  • Images in PDF do not inherently have any resolution attached to them (this is a white lie - images compressed with JPEG2000 still have resolution in their embedded metadata).

  • An Image in PDF is represented by an object that contains a series of samples that are stored using some compression filter.

  • Image objects can be rendered on a page multiple times at any size.

Since resolution is defined as the number of pixels (or samples) per unit distance, resolution only means something for a particular rendering of an image on a page. So if you are rendering a particular image to fill the page, then the resolution in dpi is

xdpi = image_width / (pageWidthInPoints / 72.0);
ydpi = image_height / (pageHeightInPoints / 72.0);
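
As a minimal sketch of the full-page case, the same formula can be wrapped in a small C helper. The function name `effective_dpi` and the constant `POINTS_PER_INCH` are made up for this illustration and are not part of PDFKit or any PDF library; the sketch assumes the image really is drawn across the whole page extent along that axis.

```c
#include <assert.h>
#include <math.h>

/* PDF user space defaults to 72 points per inch. */
#define POINTS_PER_INCH 72.0

/* Hypothetical helper: effective resolution along one axis, computed as
 * the number of image samples divided by the rendered extent in inches.
 * Only meaningful if the image spans the whole extent. */
double effective_dpi(int samples, double extent_in_points)
{
    assert(samples > 0 && extent_in_points > 0.0);
    return samples / (extent_in_points / POINTS_PER_INCH);
}
```

For instance, `effective_dpi(2550, 612.0)` evaluates to 300 for an image 2550 samples wide that fills the width of a 612 pt (8.5 inch) page. Going the other way, multiplying the page size in inches by a chosen dpi gives the pixel dimensions for an off-screen context.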

If the image is not being rendered to the full size of the page, a complete solution is very tricky. Adobe prescribes that images should be treated as occupying a 1x1 unit square and that you change the current transformation matrix (CTM) to determine how they are rendered. This means that you need the matrix in effect at the point where the image is rendered, and you need to push the points (0, 0), (0, 1) and (1, 0) through that matrix. The Euclidean distance between (0, 0)' and (1, 0)' gives you the width in points, and the Euclidean distance between (0, 0)' and (0, 1)' gives you the height in points.

So how do you get that matrix? Well, you need the content stream for the page and you need to write a PDF interpreter that can rip the content stream and keep track of changes to the CTM. When you reach your image, you extract the CTM for it.
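
To give a rough idea of the bookkeeping involved, here is a heavily simplified C sketch that tracks the CTM across the operators `q` (save), `Q` (restore), `cm` (concatenate) and `Do` (draw XObject). It assumes the content stream has already been decompressed and tokenized into an array of operations; the `PdfOp` representation and every name in it are invented for this illustration. A real interpreter must also handle inline images, form XObjects with their own matrices, page rotation and much more.

```c
#include <assert.h>
#include <stddef.h>

/* PDF matrix [a b c d e f]: x' = a*x + c*y + e, y' = b*x + d*y + f. */
typedef struct { double a, b, c, d, e, f; } PdfMatrix;

/* Hypothetical pre-tokenized content stream operation. */
typedef enum { OP_SAVE, OP_RESTORE, OP_CONCAT, OP_DRAW_XOBJECT } PdfOpKind;
typedef struct { PdfOpKind kind; PdfMatrix m; /* used by OP_CONCAT only */ } PdfOp;

#define MAX_DEPTH 32

/* Matrix product m x n: m is applied first, then n. */
static PdfMatrix pdf_concat(PdfMatrix m, PdfMatrix n)
{
    PdfMatrix r;
    r.a = m.a * n.a + m.b * n.c;
    r.b = m.a * n.b + m.b * n.d;
    r.c = m.c * n.a + m.d * n.c;
    r.d = m.c * n.b + m.d * n.d;
    r.e = m.e * n.a + m.f * n.c + n.e;
    r.f = m.e * n.b + m.f * n.d + n.f;
    return r;
}

/* Walk the operations; on the first Do, store the CTM and return 1.
 * Returns 0 if no Do operator is found. */
int ctm_at_first_xobject(const PdfOp *ops, size_t count, PdfMatrix *out)
{
    PdfMatrix stack[MAX_DEPTH];
    size_t depth = 0;
    PdfMatrix ctm = {1.0, 0.0, 0.0, 1.0, 0.0, 0.0}; /* identity */

    for (size_t i = 0; i < count; i++) {
        switch (ops[i].kind) {
        case OP_SAVE:                       /* q */
            assert(depth < MAX_DEPTH);
            stack[depth++] = ctm;
            break;
        case OP_RESTORE:                    /* Q */
            assert(depth > 0);
            ctm = stack[--depth];
            break;
        case OP_CONCAT:                     /* cm: CTM' = M x CTM */
            ctm = pdf_concat(ops[i].m, ctm);
            break;
        case OP_DRAW_XOBJECT:               /* Do */
            *out = ctm;
            return 1;
        }
    }
    return 0;
}
```

The sketch stops at the first `Do` and does not check whether the XObject is actually an image; a page produced from a single image would typically have only one, but that is an assumption about the file, not something the code verifies.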

That last step should take about an hour with a decent PDF toolkit, provided you are familiar with the toolkit. Writing that toolkit is several person-years of work.
