Java: Retrieve a JPEG from a URL and convert it to a binary or hexadecimal form suitable for embedding in an RTF document
I'm trying to write a simple RTF document pretty much from scratch in Java, and I'm trying to embed JPEGs in the document. Here's an example of a JPEG (a 2x2-pixel JPEG consisting of three white pixels and a black pixel in the upper left, if you're curious) embedded in an RTF document (generated by WordPad, which converted the JPEG to WMF):
{\pict\wmetafile8\picw53\pich53\picwgoal30\pichgoal30
0100090000036e00000000004500000000000400000003010800050000000b0200000000050000
000c0202000200030000001e000400000007010400040000000701040045000000410b2000cc00
020002000000000002000200000000002800000002000000020000000100040000000000000000
000000000000000000000000000000000000000000ffffff00fefefe0000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000
0000001202af0801010000040000002701ffff030000000000
}
I've been reading the RTF specification, and it looks like you can specify that the image is a JPEG, but since WordPad always converts images to WMF, I can't see an example of an embedded JPEG. So I may also end up needing to transcode from JPEG to WMF or something....
But basically, I'm looking for how to generate the binary or hexadecimal (Spec, p.148: "These pictures can be in hexadecimal (the default) or binary format.") form of a JPEG given a file URL.
Thanks!
EDIT: I have the stream stuff working all right, I think, but still don't understand exactly how to encode it, because whatever I'm doing, it's not RTF-readable. E.g., the above picture instead comes out as:
ffd8ffe00104a464946011106006000ffdb0430211211222222223533333644357677767789b988a877adaabcccc79efdcebcccffdb04312223336336c878ccccccccccccccccccccccccccccccccccccccccccccccccccffc0011802023122021113111ffc401f001511111100000000123456789abffc40b5100213324355440017d123041151221314161351617227114328191a182342b1c11552d1f024336272829a161718191a25262728292a3435363738393a434445464748494a535455565758595a636465666768696a737475767778797a838485868788898a92939495969798999aa2a3a4a5a6a7a8a9aab2b3b4b5b6b7b8b9bac2c3c4c5c6c7c8c9cad2d3d4d5d6d7d8d9dae1e2e3e4e5e6e7e8e9eaf1f2f3f4f5f6f7f8f9faffc401f103111111111000000123456789abffc40b51102124434754401277012311452131612415176171132232818144291a1b1c19233352f0156272d1a162434e125f11718191a262728292a35363738393a434445464748494a535455565758595a636465666768696a737475767778797a82838485868788898a92939495969798999aa2a3a4a5a6a7a8a9aab2b3b4b5b6b7b8b9bac2c3c4c5c6c7c8c9cad2d3d4d5d6d7d8d9dae2e3e4e5e6e7e8e9eaf2f3f4f5f6f7f8f9faffda0c31021131103f0fdecf09f84f4af178574cd0b42d334fd1744d16d22bd3f4fb0b74b6b5bb78902450c512091c688aaaa8a0500014514507ffd9
This PHP library would do the trick, so I'm trying to port the relevant portion to Java. Here it is:
$imageData = file_get_contents($this->_file);
$size = filesize($this->_file);
$hexString = '';
for ($i = 0; $i < $size; $i++) {
    $hex = dechex(ord($imageData{$i}));
    if (strlen($hex) == 1) {
        $hex = '0' . $hex;
    }
    $hexString .= $hex;
}
return $hexString;
But I don't know what the Java analogue to dechex(ord($imageData{$i})) is. :( I got only as far as the Integer.toHexString() function, which takes care of the dechex part....
Thanks all. :)
Comments (2)
Given a file URL for any file you can get the corresponding bytes by doing (exception handling omitted for brevity)...
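A minimal sketch of what this step could look like (not the answerer's original code). The class name `UrlBytes` and the method name `readAllBytes` are made up for illustration; the only library calls used are `URL.openStream()` and the standard stream classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// Hypothetical helper: the class and method names are illustrative only.
class UrlBytes {
    /** Reads everything the URL's stream yields into a byte array. */
    static byte[] readAllBytes(URL url) throws IOException {
        try (InputStream in = url.openStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            // read() returns -1 once the end of the stream is reached
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

Called as `byte[] jpeg = UrlBytes.readAllBytes(new URL("file:///path/to/image.jpg"));` (the path here is a placeholder), this works for `file:` URLs as well as `http:` ones, since both go through `openStream()`.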
I'm looking at your PHP snippet now and realizing that RTF is a bizarre specification! It looks like each byte of the image is encoded as 2 hex digits (which doubles the size of the image for no apparent reason). The entire thing is stored in raw ASCII encoding. So, you'll want to do...
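One way the encoding step could be sketched in Java, as an analogue of the PHP loop in the question (a sketch, not the answerer's original code; `HexEncoder` and `toHex` are illustrative names). Each byte is masked with `0xFF` so that negative `byte` values are not sign-extended, and both nibbles are always written, so a leading zero is never dropped.

```java
// Hypothetical helper: the class and method names are illustrative only.
class HexEncoder {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Encodes every byte as exactly two lowercase hex digits. */
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            int v = b & 0xFF;               // 0..255, avoids sign extension
            sb.append(DIGITS[v >>> 4]);     // high nibble
            sb.append(DIGITS[v & 0x0F]);    // low nibble
        }
        return sb.toString();
    }
}
```

The unpadded output quoted in the question starts `ffd8ffe00104a46...`, whereas the same JFIF header bytes (FF D8 FF E0 00 10 4A 46 ...) encode to `ffd8ffe000104a46...` with two digits per byte. `Integer.toHexString(b & 0xFF)` on its own produces the unpadded form, which is why it needs the extra `'0'` prefix for values below 16.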
EDIT: Updated code sample to pad 0's on the hex bytes
EDIT: negative bytes were being sign-extended when converted to ints >_<
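To connect the hex string back to the RTF in the question: the RTF specification defines a `\jpegblip` control word that marks the picture source as a JPEG, so the group could be assembled roughly as below. This is a sketch under assumptions: `RtfPicture` and `jpegPict` are illustrative names, the caller supplies the pixel and twip dimensions, and whether a given reader (WordPad included) actually renders `\jpegblip` pictures is not verified here.

```java
// Hypothetical helper: the class and method names are illustrative only.
class RtfPicture {
    /**
     * Wraps hex-encoded JPEG data in a \pict group.
     * widthPx/heightPx go into \picw/\pich; the *Twips values go into
     * \picwgoal/\pichgoal (the WordPad example in the question uses 30 twips for 2 pixels).
     */
    static String jpegPict(String hex, int widthPx, int heightPx,
                           int widthTwips, int heightTwips) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\\pict\\jpegblip")
          .append("\\picw").append(widthPx)
          .append("\\pich").append(heightPx)
          .append("\\picwgoal").append(widthTwips)
          .append("\\pichgoal").append(heightTwips)
          .append('\n');
        // Break the data into 78-character lines, as in the WordPad output;
        // 78 is even, so no byte's two digits are split across lines.
        for (int i = 0; i < hex.length(); i += 78) {
            sb.append(hex, i, Math.min(i + 78, hex.length())).append('\n');
        }
        sb.append('}');
        return sb.toString();
    }
}
```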
Maybe this will help:
This code takes the representation as a single string and stores the result in a file.
https://joseluisbz.wordpress.com/2011/06/22/script-de-clases-rtf-para-jsp-y-php/
Now, if you want to obtain the representation of an image file, you can use this:
https://joseluisbz.wordpress.com/2013/07/26/exploring-a-wmf-file-0x000900/
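The code behind these links is not reproduced here. As a rough illustration of the first step this answer describes (taking the hex representation as one string and storing the decoded bytes in a file), a sketch could look like the following. `HexDecoder`, `fromHex`, and `writeHexToFile` are illustrative names, not taken from the linked posts.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: the class and method names are illustrative only.
class HexDecoder {
    /** Decodes a hex string (whitespace and line breaks ignored) into bytes. */
    static byte[] fromHex(String hex) {
        String clean = hex.replaceAll("\\s+", "");
        if (clean.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[clean.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(clean.charAt(2 * i), 16);
            int lo = Character.digit(clean.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("non-hex character near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    /** Decodes the hex string and writes the resulting bytes to the target file. */
    static void writeHexToFile(String hex, Path target) throws IOException {
        Files.write(target, fromHex(hex));
    }
}
```

Decoding the picture data from an RTF file this way and opening the result in an image viewer is a quick check that the encoding step produced two digits per byte.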