How does Adobe do PDF compression?

Posted on 2024-08-11 08:56:58

This is a bit more of a fun question than a serious one, but how does the Adobe PDF format make documents so... portable?

I just created a small Word document, 235 KB in size, containing multiple color photos and a few textual phrases. A PDF created using CutePDF (which I understand isn't the most efficient method of PDF creation) is only 176 KB. That's a 25% reduction in size. When those files are placed into a compressed folder, the PDF shrinks by a further 3%, whereas the .docx only shrinks by 2%. I'm sure that larger files would have even greater differences in size.

My question is, how does Adobe manage to make their files so much smaller? I understand that they are drawn from raster graphics, but my 3 bitmap files really can't be helped from raster that much, can they?

Comments (3)

虐人心 2024-08-18 08:56:58

If you have Acrobat 9, there is a nice built-in tool that lets you see how the PDF was put together (and which compression was used). There is a blog post explaining how to use it at http://pdf.jpedal.org/java-pdf-blog/bid/10479/Viewing-PDF-objects

千寻… 2024-08-18 08:56:58

There are a few ways it could be compressing this:

  1. PDF files use LZW and zip (Flate/Deflate) compression for their internal streams.

  2. If the image is scaled down in the document, or has a higher DPI on disk than CutePDF is set to allow (for example, if CutePDF is set to 300 DPI and the image is 600 DPI), it can be downsampled in the PDF.

  3. Microsoft stores TONS of info in the docx format, in XML. WAY more than is really needed just to export the content (for an example, try copying and pasting your text into a textbox cell and look at the HTML that comes out. I had a limit on a textbox size for a CMS, and a 7-word sentence ballooned to 950 characters). This is so the document can be edited later, and it includes a lot of esoteric info to make sure everything displays right in every possible permutation. The PDF doesn't need that info, so it can just record the font and size and strip out everything unnecessary, saving a ton of space.
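To get a feel for points 1 and 3, here is a minimal sketch using Python's standard `zlib` module (Deflate, the same family as the zip/Flate compression mentioned above). The markup string is a made-up stand-in for verbose document XML, not real Word output, and random bytes are used as an assumed stand-in for already-compressed photo data such as JPEG.

```python
import os
import zlib

# Hypothetical stand-in for verbose, repetitive document markup
# (illustrative only, not actual docx content).
markup = b'<w:r><w:rPr><w:sz w:val="24"/></w:rPr><w:t>word</w:t></w:r>' * 200

# Random bytes act as a rough stand-in for already-compressed image data.
image_like = os.urandom(len(markup))


def deflate_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


markup_ratio = deflate_ratio(markup)
image_ratio = deflate_ratio(image_like)

print(f"repetitive markup shrinks to {markup_ratio:.1%} of its size")
print(f"image-like data stays at     {image_ratio:.1%} of its size")
```

Repetitive text collapses to a small fraction of its size, while the image-like data barely changes. That is roughly why zipping a photo-heavy PDF or .docx only gains a few percent: the photos dominate the file and are already compressed.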

岁月静好 2024-08-18 08:56:58

When you use such small files, any overhead in the document format will have a disproportionate effect, which is why you are seeing such large percentage differences.

I took a 2683KB JPEG and inserted it into a new Word 2003 document. The resulting .doc file was 2725KB (or 2697KB as .docx). Turning this into a PDF gives me a 2701KB PDF. So I am seeing a difference of about 24KB between the .doc and the PDF, but that is only about 1% because the image data dominates the file size. It is about half what you got, but maybe the version of Word you have is more verbose when making a .docx?

For the PDF, Acrobat shows space usage as 2691K image, 8.27K overhead and 1K fonts. PDF is quite a sparse format in its syntax, which limits overhead, and much of that overhead consists of repeating strings, so it is easily compressible.
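The arithmetic behind those figures can be checked with a short sketch. It uses only the sizes quoted above and makes one simplifying assumption: everything in a file beyond the original 2683KB JPEG is counted as format overhead, which ignores the possibility that the image was re-encoded.

```python
# Sizes in KB, taken from the figures quoted above.
JPEG_KB = 2683
CONTAINERS_KB = {"doc": 2725, "docx": 2697, "pdf": 2701}


def overhead(container_kb, payload_kb=JPEG_KB):
    """Return (extra KB beyond the payload, extra as a % of the file)."""
    extra = container_kb - payload_kb
    return extra, 100 * extra / container_kb


for name, size in CONTAINERS_KB.items():
    extra, pct = overhead(size)
    print(f"{name}: {extra} KB overhead ({pct:.2f}% of the file)")

# Acrobat's own space-usage breakdown for the PDF, as quoted above.
acrobat_kb = {"image": 2691, "overhead": 8.27, "fonts": 1}
image_share = 100 * acrobat_kb["image"] / sum(acrobat_kb.values())
print(f"image data is {image_share:.1f}% of the PDF")
```

With a large image, the container overhead is a percent or two at most in every format, so the format choice hardly matters. With a 235KB document, the same few tens of kilobytes of overhead become a large share of the total.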

If you want to see what the PDF contains in a tree-like view you can download the demo version of CosEdit.
