C#: decoding (decompressing) Deflate data from a PDF file
I would like to decompress some Deflate-encoded data (extracted from a PDF) in C#.
Unfortunately, every time I try I get the exception "Found invalid data while decoding."
But the data is valid.
private void Decompress()
{
    FileStream fs = new FileStream(@"S:\Temp\myFile.bin", FileMode.Open);
    //First two bytes are irrelevant
    fs.ReadByte();
    fs.ReadByte();
    DeflateStream d_Stream = new DeflateStream(fs, CompressionMode.Decompress);
    StreamToFile(d_Stream, @"S:\Temp\myFile1.txt", FileMode.OpenOrCreate);
    d_Stream.Close();
    fs.Close();
}

private static void StreamToFile(Stream inputStream, string outputFile, FileMode fileMode)
{
    if (inputStream == null)
        throw new ArgumentNullException("inputStream");
    if (String.IsNullOrEmpty(outputFile))
        throw new ArgumentException("Argument null or empty.", "outputFile");
    using (FileStream outputStream = new FileStream(outputFile, fileMode, FileAccess.Write))
    {
        int cnt = 0;
        const int LEN = 4096;
        byte[] buffer = new byte[LEN];
        while ((cnt = inputStream.Read(buffer, 0, LEN)) != 0)
            outputStream.Write(buffer, 0, cnt);
    }
}
Does anyone have any ideas?
Thanks.
I added this for test data:
Modified Decompress like this:
Ran it like this:
And got no errors.
I conclude that either the first two bytes are relevant (they obviously are with my particular test data) or your data has a problem.
Can we have some of your test data to play with? (Obviously not if it's sensitive.)
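A round-trip check of this kind can be sketched as follows. This is an illustration, not the answerer's original code, and the class and method names are hypothetical. It compresses a buffer with DeflateStream, which writes raw Deflate data (RFC 1951) with no header, and then decompresses it starting from the first byte. For data produced this way, the first two bytes are part of the compressed payload and must not be skipped.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper for a compress/decompress round trip.
public static class DeflateRoundTrip
{
    // Produces raw Deflate data (RFC 1951): no zlib header, no checksum trailer.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen: true so that 'output' is still usable after the
            // DeflateStream has been disposed (which flushes the final block).
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Decompresses raw Deflate data, reading from the very first byte.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Calling `DeflateRoundTrip.Decompress(DeflateRoundTrip.Compress(bytes))` should return a copy of `bytes`. If the same data fails once two leading bytes are dropped, those bytes were part of the Deflate stream rather than a wrapper header.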
Thank you user159335 and user1011394 for putting me on the right track! Just pass all the bytes of the stream to the input of the function above. Make sure the byte count matches the specified length.
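One way to apply this is sketched below, under several assumptions that are mine rather than the poster's: the PDF is already in memory, `streamStart` is the offset of the first data byte after the `stream` keyword and its end-of-line marker, `length` is the byte count given for the stream, and the data carries the two-byte zlib header (RFC 1950) that DeflateStream does not parse. The class, method, and parameter names are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: inflates exactly 'length' bytes of a PDF stream body.
public static class PdfStreamDecoder
{
    public static byte[] DecodeFlate(byte[] pdfBytes, int streamStart, int length)
    {
        if (pdfBytes == null)
            throw new ArgumentNullException(nameof(pdfBytes));
        if (length < 2 || streamStart < 0 || streamStart > pdfBytes.Length - length)
            throw new ArgumentOutOfRangeException(nameof(length), "Stream range lies outside the buffer.");

        // Assumption: the data starts with a 2-byte zlib header, which is skipped
        // because DeflateStream expects raw Deflate data.
        const int ZlibHeaderSize = 2;

        // Restrict the input to the stream body so that no bytes before or after
        // it (such as the "endstream" keyword) reach the decoder.
        using (var input = new MemoryStream(pdfBytes, streamStart + ZlibHeaderSize, length - ZlibHeaderSize))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            // The 4-byte Adler-32 trailer after the Deflate data is not verified here.
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Bounding the MemoryStream to the stated length is the point of the sketch: feeding the decoder too few bytes, or bytes that do not belong to the stream, is a plausible cause of "Found invalid data while decoding."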
All you need to do is use GZip instead of Deflate. Below is the code I use for the content of the stream… endstream section in a PDF document:
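Whether GZipStream or DeflateStream fits depends on how the compressed bytes are wrapped. A gzip member (RFC 1952) starts with the bytes 0x1F 0x8B, while a zlib stream (RFC 1950) starts with a two-byte header whose first byte has 8 in its low nibble and whose big-endian 16-bit value is a multiple of 31. The following sketch checks for both; the type and method names are hypothetical, and the check is only a heuristic, since raw Deflate data can happen to begin with bytes that look like a zlib header.

```csharp
// Hypothetical names; "Unknown" may still be raw Deflate data (RFC 1951).
public enum FlateContainer { Unknown, Zlib, GZip }

public static class FlateSniffer
{
    public static FlateContainer Detect(byte[] data)
    {
        if (data == null || data.Length < 2)
            return FlateContainer.Unknown;

        // gzip magic number (RFC 1952).
        if (data[0] == 0x1F && data[1] == 0x8B)
            return FlateContainer.GZip;

        // zlib header (RFC 1950): compression method 8 in the low nibble of the
        // first byte, and (CMF * 256 + FLG) divisible by 31.
        bool deflateMethod = (data[0] & 0x0F) == 8;
        bool headerCheckOk = ((data[0] << 8) | data[1]) % 31 == 0;
        return deflateMethod && headerCheckOk ? FlateContainer.Zlib : FlateContainer.Unknown;
    }
}
```

For example, data beginning with 0x78 0x9C is reported as `Zlib`, because 0x789C is divisible by 31 and the low nibble of 0x78 is 8.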
None of the solutions worked for me on Deflate attachments in a PDF/A-3 document. Some research showed that the .NET DeflateStream does not support compressed streams with a header and trailer as per RFC 1950. Error message for reference: The archive entry was compressed using an unsupported compression method.
The solution is to use an alternative library, SharpZipLib.
Here is a simple method that successfully decoded a Deflate attachment from a PDF/A-3 file for me:
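A minimal sketch of such a method with SharpZipLib follows. It is not the answerer's original code: the class and method names are hypothetical, and it assumes the SharpZipLib NuGet package is referenced and that the attachment bytes form a complete zlib stream (RFC 1950 header, Deflate data, Adler-32 trailer).

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

// Hypothetical helper built on SharpZipLib's InflaterInputStream.
public static class ZlibDecoder
{
    public static byte[] Inflate(byte[] zlibData)
    {
        using (var input = new MemoryStream(zlibData))
        // With its default Inflater, InflaterInputStream expects the zlib
        // header and trailer, so no bytes need to be skipped by hand.
        using (var inflater = new InflaterInputStream(input))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Because the library parses the wrapper itself, the whole stream body is passed in unchanged, in contrast to the DeflateStream approach of dropping the first two bytes.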