Decompressing a file using the LZS algorithm

Posted 2025-01-28 07:07:44 · 787 characters · 6 views · 0 comments

I am using this algorithm (https://github.com/sgherro/Exercises-cpp/blob/89bbd78eeac9666ed20f083ebf116e693a8c23ce/Lempel-Ziv-Stac/main.cpp) to decompress my doc file, but it only decompresses part of the file data rather than all of it. Can anyone help me decompress this file? The data is in hex format.


[Image: screenshot of the BKF file data]

The image I uploaded is a screenshot of a BKF file that contains a single image file. I know sharing a screenshot is not ideal, but can you tell from this data where I should start cropping so that the extracted image data is not corrupt? As far as I can tell, the compressed data starts 18 bytes after the FH block, because when file data is not compressed, the data starts at the FH block. In http://laytongraphics.com/mtf/MTF_100a.PDF, pages 70-71 explain the format of the compressed data and pages 109-110 explain the algorithm, but I cannot find any link for it. If you understand my question, please help.
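Since the data is described as being in hex format, it would need to be turned into raw bytes before a byte-oriented decompressor can read it. The following is a minimal sketch of that step, assuming the hex is plain text with optional whitespace between digits. The names `hex_value` and `hex_to_bytes` are made up for illustration and are not part of the linked code.

```c
#include <ctype.h>
#include <stddef.h>

// Value of one hex digit, or -1 if c is not a hex digit.
static int hex_value(int c) {
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

// Convert hex text into raw bytes, skipping whitespace between digits.
// Returns the number of bytes written to out, or -1 on a non-hex character,
// an odd number of digits, or if out (of size cap) is too small.
// (Hypothetical helper, shown only to illustrate the conversion.)
long hex_to_bytes(const char *hex, unsigned char *out, size_t cap) {
    size_t n = 0;
    int high = -1;                  // pending high nibble, or -1 if none
    for (; *hex; hex++) {
        if (isspace((unsigned char)*hex))
            continue;
        int v = hex_value((unsigned char)*hex);
        if (v < 0)
            return -1;              // not a hex digit
        if (high < 0)
            high = v;
        else {
            if (n == cap)
                return -1;          // output buffer full
            out[n++] = (unsigned char)(high << 4 | v);
            high = -1;
        }
    }
    return high < 0 ? (long)n : -1; // odd digit count is an error
}
```

The resulting bytes could then be written to a file with `fwrite` and fed to a decompressor that reads binary input.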


Comments (1)

无风消散 2025-02-04 07:07:44

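The compressed data is in the Lempel-Ziv-Stac format, documented in many places, including RFC 1967. It is a very simple compressed data format. Here is code that can decompress that format: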

/*
  unlzs version 1.0, 16 May 2022

  Copyright (C) 2022 Mark Adler

  This software is provided 'as-is', without any express or implied warranty.
  In no event will the authors be held liable for any damages arising from the
  use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not claim
     that you wrote the original software. If you use this software in a
     product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  Mark Adler
  [email protected]
 */

// Decompress LZS (Lempel-Ziv-Stac) compressed data. See RFC 1967 for the
// compressed data format description, or ANSI X3.241-1994 for the official
// standard.

#include <stdio.h>
#include <setjmp.h>

// Bitstream for pulling sequences of bits from the input.
typedef struct {
    int bits;           // number of bits in the buffer
    long buf;           // bit buffer, with least significant being newest
    FILE *in;           // input file
    jmp_buf bad;        // longjmp state for use if end of input reached
} bitstream_t;

// Return bits bits from the stream bs. Long jump if EOF reached before bits
// bits can be pulled. This can support up to 25 bits requested, for a 32-bit
// long. (The most requested here is 11.)
static inline long get(bitstream_t *bs, int bits) {
    while (bs->bits < bits) {
        int ch = getc(bs->in);
        if (ch == EOF)
            longjmp(bs->bad, 1);
        bs->buf = (bs->buf << 8) + ch;
        bs->bits += 8;
    }
    bs->bits -= bits;
    return (bs->buf >> bs->bits) & ((1L << bits) - 1);
}

// Sliding window for matching byte patterns to copy. This is a circular buffer
// of the last 2K bytes of uncompressed data.
#define WSIZE 2048                  // LZS window size
typedef struct {
    unsigned next;                  // index of next location in window[]
    int full;                       // true if the window is full
    FILE *out;                      // file to write output to
    unsigned char window[WSIZE];    // sliding window
} window_t;

// Write one byte to the output, updating the sliding window.
static inline void put(window_t *win, unsigned octet) {
    win->window[win->next++] = octet;
    if (win->next == sizeof(win->window)) {
        win->next = 0;
        win->full = 1;
    }
    putc(octet, win->out);
}

// Decompress LZS from in to out. Return 0 on success, 1 if the input ends
// prematurely, 2 if a distance extends past the start of the input (like
// trying to look back in time before the Big Bang), 3 for an invalid distance
// code (using 11 bits for a distance when 7 bits would suffice -- this is a
// constraint from a shall statement in the ANSI standard), or 4 if the final
// padding bits are not zero (also a shall statement). A distance past the
// start of input error is the most common error encountered on random input,
// by far. That error likely indicates that the input is not LZS-compressed
// data in the first place.
static int unlzs(FILE *in, FILE *out) {
    bitstream_t bs = {0, 0, in, {0}};
    if (setjmp(bs.bad))
        return 1;                   // premature end of input
    window_t win = {0, 0, out, {0}};
    for (;;) {
        if (get(&bs, 1)) {
            // Get distance, length pair.
            unsigned num = get(&bs, 1) ? 7 : 11;
            unsigned dist = get(&bs, num);
            if (num == 11 && dist < 128)
                return 3;           // invalid distance code
            if (dist == 0) {
                // End marker.
                if (get(&bs, bs.bits) != 0)
                    return 4;       // padding bits not zero
                return 0;           // clean end of input
            }
            unsigned base = 2;
            unsigned len = get(&bs, 2);
            if (len == 3) {
                base += len;
                len = get(&bs, 2);
                if (len == 3) {
                    base += len;
                    while ((len = get(&bs, 4)) == 15)
                        base += len;
                }
            }
            len += base;

            // Copy len bytes from dist back in the window.
            if (!win.full && dist > win.next)
                return 2;           // distance too far
            do {
                put(&win, win.window[(win.next - dist) & (WSIZE - 1)]);
            } while (--len);
        }
        else
            // Get and write a literal.
            put(&win, get(&bs, 8));
    }
}

// Return the number of bytes left to be read from in.
static unsigned long left(FILE *in) {
    unsigned long n = 0;
    while (getc(in) != EOF)
        n++;
    return n;
}

// Decompress LZS from stdin to stdout.
int main(void) {
    int ret = unlzs(stdin, stdout);
    unsigned long n = left(stdin);
    if (ret == 1)
        fputs("premature end of input\n", stderr);
    else if (ret == 2)
        fprintf(stderr, "distance too far (%lu left)\n", n);
    else if (ret == 3)
        fprintf(stderr, "invalid distance code (%lu left)\n", n);
    else if (ret == 4)
        fprintf(stderr, "padding bits at end not zero (%lu left)\n", n);
    else if (n)
        fprintf(stderr, "%lu bytes of junk after compressed data\n", n);
    return ret;
}
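When the compressed bytes are already in memory, for example a region cropped out of a larger file, the same token format can be decoded buffer to buffer. The following is a simplified sketch and not part of the program above: the names `bitreader`, `take` and `lzs_decode` are made up for illustration, all errors collapse to -1, and the zero-padding check after the end marker is omitted. Because the whole output stays in `dst`, matches are copied directly from earlier output instead of from a circular window.

```c
#include <stddef.h>

// Bit reader over a memory buffer, most significant bit first.
typedef struct {
    const unsigned char *buf;       // input bytes
    size_t len, pos;                // input length and index of next byte
    unsigned long acc;              // bit accumulator, newest bits lowest
    int bits;                       // number of valid bits in acc
} bitreader;

// Return the next n bits (n <= 11 here), or -1 if the input runs out.
static long take(bitreader *br, int n) {
    while (br->bits < n) {
        if (br->pos == br->len)
            return -1;
        br->acc = (br->acc << 8) | br->buf[br->pos++];
        br->bits += 8;
    }
    br->bits -= n;
    return (long)((br->acc >> br->bits) & ((1UL << n) - 1));
}

// Decode LZS data from src into dst. Returns the number of bytes produced,
// or -1 on truncated input, an invalid or too-far distance, or output that
// does not fit in cap bytes. (Hypothetical helper for illustration.)
long lzs_decode(const unsigned char *src, size_t srclen,
                unsigned char *dst, size_t cap) {
    bitreader br = {src, srclen, 0, 0, 0};
    size_t out = 0;
    for (;;) {
        long flag = take(&br, 1);
        if (flag < 0)
            return -1;
        if (flag == 0) {            // literal: 0 bit, then 8 data bits
            long lit = take(&br, 8);
            if (lit < 0 || out == cap)
                return -1;
            dst[out++] = (unsigned char)lit;
            continue;
        }
        long is7 = take(&br, 1);    // 1: 7-bit distance, 0: 11-bit distance
        if (is7 < 0)
            return -1;
        long dist = take(&br, is7 ? 7 : 11);
        if (dist < 0 || (!is7 && dist < 128))
            return -1;
        if (dist == 0)
            return (long)out;       // end marker
        size_t len = 2;             // lengths start at 2
        long v = take(&br, 2);
        if (v == 3) {
            len += 3;
            v = take(&br, 2);
            if (v == 3) {
                len += 3;
                while ((v = take(&br, 4)) == 15)
                    len += 15;
            }
        }
        if (v < 0)
            return -1;
        len += (size_t)v;
        size_t d = (size_t)dist;
        if (d > out || len > cap - out)
            return -1;              // before start of output, or no room
        for (; len; len--, out++)
            dst[out] = dst[out - d];    // overlapping copy, byte by byte
    }
}
```

As a hand-built check of the format, the five bytes 30 98 B0 56 00 encode the literals 'a' and 'b', a match of length 4 at distance 2, and the end marker, so decoding them yields the six bytes `ababab`.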
