How can I reduce the size of an RTF file with embedded images?
We have some code that produces an RTF document from an RTF template. It basically does a string search-and-replace of special tags within the RTF file. This is accessible via a web page.
Typically, the processing time for this is really quick.
However, we need to embed an image within a template. We've been embedding these as JPEG images using Word's "Insert/Picture/From File..." functionality. But we've found that the resulting RTF file size depends heavily on the image.
For example, I've inserted a 20k JPEG logo (which is basically a solid background with some text). The RTF file increased in size from around 390k (without the image) to 510k (with the image).
Then we inserted a JPEG containing a screenshot, i.e. the image contains text, multiple colours, etc. The JPEG is around 150k. Using this image, the RTF file increased in size from 390k to 3.5MB.
So the encoding that Word uses for storing images in an RTF file doesn't scale linearly with the JPEG size. I'm guessing it depends on what is in the JPEG image.
I need to keep the RTF templates as small as possible to keep our file processing times to a minimum.
- Does anyone have any ideas on how to minimize the size of the RTF files with embedded images?
- Is there any way of controlling the encoding that Word uses? I can't see any options anywhere.
- Does anyone know what type of binary encoding Word/RTF uses?
Thanks in advance.
Here is the best solution
http://support.microsoft.com/kb/224663
Excerpt:
An image in an RTF file gets stored as a WMF, uncompressed. On a Mac, it would be macpict. Your best bet to keep the file size down is to link the image to the document rather than insert a copy in the document. The trade-off is that you have to keep the files together.
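As a rough illustration of linking rather than embedding, the sketch below builds an RTF field group that uses Word's INCLUDEPICTURE field with the `\d` switch (do not store the graphic data in the document). The helper name `linked_picture_field` and the file name `logo.jpg` are made up for this example, and it assumes the path contains no braces. Word may need the field updated before the picture appears, so treat this as a starting point to test, not a confirmed recipe.

```python
def linked_picture_field(path: str) -> str:
    """Return an RTF field group that links to a picture file.

    Hypothetical helper: assumes `path` contains no '{' or '}' characters.
    """
    # Word field codes expect backslashes in paths to be doubled.
    field_path = path.replace("\\", "\\\\")
    # \d asks Word not to store the picture data in the document.
    instruction = 'INCLUDEPICTURE "%s" \\d' % field_path
    # Inside RTF text every literal backslash must itself be escaped.
    rtf_instruction = instruction.replace("\\", "\\\\")
    return "{\\field{\\*\\fldinst " + rtf_instruction + "}{\\fldrslt }}"


if __name__ == "__main__":
    print(linked_picture_field("logo.jpg"))
```

The image file then has to travel with the document, which is the trade-off mentioned above.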
EDIT
Is compressing the RTF an option? Using zip/rar, you'll get your file size back, but obviously you'll have to uncompress it first. There are supposed to be tools that will do RTF compression, but I have never used them.
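To get a feel for how much general-purpose compression might recover, here is a minimal sketch using Python's standard `gzip` module. The function names and the sample data are invented for illustration; real savings depend on the actual picture data in the file.

```python
import gzip


def compress_rtf(rtf_bytes: bytes) -> bytes:
    """Compress the raw bytes of an RTF file with gzip."""
    return gzip.compress(rtf_bytes)


def decompress_rtf(packed: bytes) -> bytes:
    """Restore the original RTF bytes before handing them to an RTF reader."""
    return gzip.decompress(packed)


if __name__ == "__main__":
    # Made-up stand-in for a template with hex-encoded picture data.
    sample = b"{\\rtf1 " + b"ff00" * 5000 + b"}"
    packed = compress_rtf(sample)
    print(len(sample), "bytes before,", len(packed), "bytes after")
```

Since the documents are served through a web page, compressing them on disk or in transit could be one way to apply this, at the cost of a decompression step before the search-and-replace runs.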
We have done a similar project at work, only we're not using the "Insert/Picture/From File..." functionality. Our template has a tag named [photos], as I presume yours does too. When we process the document, we replace the tag with the RTF codes needed to display the images. We put them in a table with two images on each row, plus a row on top for the title.
So, you might place a [photos] tag in your template and then replace the tag with the RTF codes. You can find some good references to these codes on the web. For example, here.
Now, my code looks something like this:
If you get your image into a byte array, you can use BitConverter.ToString(array) to get your hex code; you only need to replace the dashes "-" with "".
Our files take up less than 1/10th of the space a "normal" RTF does. If we open the doc's code with an editor such as Notepad++, we can see the RTF codes, but if we open the document in Word and save it as RTF (changing its name), it goes from 1.5 MB to 50 MB!
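The answer's own code did not survive on this page, so what follows is only a sketch of the same idea in Python, not the author's implementation. It hex-encodes the JPEG bytes (`bytes.hex()` plays the role of `BitConverter.ToString(array)` with the dashes removed), wraps them in an RTF `\pict` group marked `\jpegblip`, and substitutes the group for the `[photos]` tag. The function names and the display size are assumptions, and some RTF readers may also want `\picw`/`\pich`.

```python
def jpeg_to_rtf_pict(jpeg_bytes: bytes, width_twips: int, height_twips: int) -> str:
    """Wrap raw JPEG bytes in an RTF picture group (hypothetical helper).

    \\picwgoal and \\pichgoal give the desired display size in twips
    (1440 twips = 1 inch).
    """
    hex_data = jpeg_bytes.hex()  # two hex characters per byte, no separators
    return "{\\pict\\jpegblip\\picwgoal%d\\pichgoal%d\n%s}" % (
        width_twips,
        height_twips,
        hex_data,
    )


def fill_template(template: str, pict_group: str, tag: str = "[photos]") -> str:
    """Replace the placeholder tag in the RTF template with the picture group."""
    return template.replace(tag, pict_group)


if __name__ == "__main__":
    # Four placeholder bytes stand in for a real JPEG file's contents.
    pict = jpeg_to_rtf_pict(b"\xff\xd8\xff\xd9", 1440, 720)
    print(fill_template("{\\rtf1 [photos]}", pict))
```

Written this way, the picture is stored once as the original JPEG data in hex, which would fit the size difference described in this answer.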
I'm guessing DaveParillo's reply justifies it: I'm only writing each image once.
Hope it helps.
Cheers mate
First, keep in mind that each byte of picture data is stored as two hexadecimal characters (two bytes of text), which means the file grows by at least twice the size of the original picture.
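A quick back-of-the-envelope check of that factor of two against the sizes quoted in the question (treating 3.5MB as roughly 3500k, which is an assumption) might look like this:

```python
def hex_size(n: int) -> int:
    """Size of n bytes (or kilobytes) once written as hex text: 2 characters per byte."""
    return 2 * n


def unexplained_growth_kb(jpeg_kb: int, rtf_growth_kb: int) -> int:
    """Growth of the RTF file that hex encoding of the JPEG alone does not explain."""
    return rtf_growth_kb - hex_size(jpeg_kb)


# Approximate figures from the question, in kilobytes.
cases = {
    "logo": {"jpeg_kb": 20, "rtf_growth_kb": 510 - 390},
    "screenshot": {"jpeg_kb": 150, "rtf_growth_kb": 3500 - 390},
}

if __name__ == "__main__":
    for name, case in cases.items():
        print(name, hex_size(case["jpeg_kb"]), unexplained_growth_kb(**case))
```

In both cases the file grows by much more than twice the JPEG size, so hex encoding alone does not account for it. That would be consistent with Word also writing a second, larger representation of the image, although this arithmetic cannot confirm what that extra data is.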
Also note that Word and WordPad each insert a different flavor (format) of the same image, plus other fields that the RTF can be displayed without.
Here are some scripts used to insert images in RTF (https://joseluisbz.wordpress.com/2011/06/22/script-de-clases-rtf-para-jsp-y-php/), and one example of use (https://joseluisbz.wordpress.com/2011/07/16/subiendo-imagenes-png-y-jpg-y-archivos-a-mysql-con-php-y-jsp-y-mostrarlos-en-rtf-usando-clases/)
You may also need to replace the original image with another one (http://joseluisbz.wordpress.com/2013/07/26/exploring-a-wmf-file-0x000900/).
Swartbees' answer worked perfectly for me. I first reduced the image quality to "0" using GIMP's "Save as JPEG" functionality. After following the Microsoft solution suggested by Swartbees above, I reinserted the picture into the file, and the size increase was negligible: 229k to 279k (as opposed to 29000k).
Thanks for your suggestions guys.
Yes, by removing the redundant characters. To do this, you must insert them back into your stream when the file is read.
For instance, if you have more than twenty f characters in a row, you can replace them with f[20] in your stream. It is a start.
-Best of luck.
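The run-length idea above could be sketched as follows. The `char[count]` notation follows the answer's `f[20]` example, while the function names and the threshold handling are assumptions. The encoded text is no longer valid RTF picture data, so it would have to be expanded again before any RTF reader sees it.

```python
import re


def rle_encode(hex_text: str, threshold: int = 20) -> str:
    """Collapse runs of more than `threshold` identical hex digits into 'c[n]'."""
    pattern = r"([0-9a-fA-F])\1{%d,}" % threshold
    return re.sub(pattern, lambda m: "%s[%d]" % (m.group(1), len(m.group(0))), hex_text)


def rle_decode(encoded: str) -> str:
    """Expand every 'c[n]' marker back into n copies of the digit c."""
    return re.sub(
        r"([0-9a-fA-F])\[(\d+)\]",
        lambda m: m.group(1) * int(m.group(2)),
        encoded,
    )


if __name__ == "__main__":
    data = "ab" + "f" * 25 + "cd" + "0" * 5
    packed = rle_encode(data)
    print(packed)
    print(rle_decode(packed) == data)
```

This only helps when the hex data contains long runs of one digit, so a general-purpose compressor is likely to do better in most cases.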