zlib inflate returns an empty result in some cases
My program processes PDF files and reads some streams out of them. Some of these streams are Flate-encoded, and I use zlib's inflate() function to decompress them.
This usually works well with the following code:
static string FlateDecode(string s){
    int factor = 50;
    z_stream stream;
    while(true){
        char * out = new char[s.length()*factor];
        stream.zalloc = Z_NULL;
        stream.zfree = Z_NULL;
        stream.opaque = Z_NULL;
        stream.avail_in = s.length();
        stream.next_in = (Bytef*)s.c_str();
        stream.avail_out = s.length()*factor;
        stream.next_out = (Bytef*)out;
        inflateInit(&stream);
        inflate(&stream, Z_FINISH);
        inflateEnd(&stream);
        if(stream.total_out >= factor*s.length()){
            delete[] out;
            factor *= 2;
            continue;
        }
        string result;
        for(unsigned long i = 0; i < stream.total_out; i++){
            result += out[i];
        }
        delete[] out;
        return result;
    }
}
But inflate produces an empty result for some streams. It doesn't happen often, but it does happen. Does anyone have an idea why?
The streams must be OK, because all PDF readers read the PDF files correctly.
Thanks for your help!
UPDATE
I've uploaded the PDF and the stream so you can check for yourself.
PDF -> The stream starts at byte 43296
UPDATE 2
I compared the streams that can't be decompressed with the streams that can. I noticed something interesting: the working streams all begin with the 2 bytes H%, while the problematic streams begin with ö>. Does anyone know what this means?
Thanks for any help!
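For readers wondering what the first two bytes can tell you, here is a minimal sketch that checks them against the zlib header rules of RFC 1950 and the block-type rule of RFC 1951. The helper names are hypothetical and not part of zlib. It also assumes that the displayed characters ö> correspond to the bytes 0xF6 0x3E (Latin-1 / Windows-1252), which the post does not confirm.

```cpp
#include <cstdint>

// Hypothetical helper (not part of zlib): true if the two bytes form a valid
// zlib header for deflate data as described in RFC 1950:
//   - low nibble of CMF (compression method) is 8,
//   - high nibble of CMF (window size exponent minus 8) is at most 7,
//   - CMF * 256 + FLG is a multiple of 31.
static bool LooksLikeZlibHeader(std::uint8_t cmf, std::uint8_t flg) {
    const unsigned method = cmf & 0x0F;
    const unsigned windowInfo = cmf >> 4;
    const unsigned check = (static_cast<unsigned>(cmf) << 8) | flg;
    return method == 8 && windowInfo <= 7 && check % 31 == 0;
}

// Hypothetical helper: true if the byte could be the start of a raw deflate
// stream (RFC 1951). Bit 0 is BFINAL and bits 1-2 are BTYPE; BTYPE == 3 is
// reserved and therefore invalid.
static bool CouldStartRawDeflate(std::uint8_t first) {
    return ((first >> 1) & 0x03) != 3;
}
```

Under that byte assumption, LooksLikeZlibHeader(0xF6, 0x3E) is false (the method nibble is 6, not 8) and CouldStartRawDeflate(0xF6) is also false, which would suggest that the bytes handed to inflate() are not plain deflate data at that offset. The letter H is 0x48, a plausible first header byte (method 8, 4 KiB window); whether the pair passes depends on the exact value of the second byte.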
Comments (2)
You shouldn't reinitialize the stream on each iteration. Initialize it before the loop and call inflate() inside the loop until it returns either Z_OK or Z_STREAM_END.
zlib simply does not seem to support all deflated streams found in PDF files.