Base64-decoded file is not equal to the original unencoded file

Posted 2024-12-28 23:55:13


I have a normal PDF file, A.pdf. A third party encodes the file in Base64 and sends it to me through a web service as one long string (I have no control over the third party).

My problem is that when I decode the string with Java's org.apache.commons.codec.binary.Base64 and write the output to a file called B.pdf, I expect B.pdf to be identical to A.pdf, but B.pdf turns out slightly different from A.pdf. As a result, Acrobat does not recognize B.pdf as a valid PDF.

Does Base64 have different encoding or charset variants? Can I detect how the string I received was encoded, so that B.pdf = A.pdf?

EDIT: this is the file I want to decode; after decoding, it should open as a PDF.

my encoded file

This is the header of each file, opened in Notepad++:

**A.pdf**
        %PDF-1.4
        %±²³´
        %Created by Wnv/EP PDF Tools v6.1
        1 0 obj
        <<
        /PageMode /UseNone
        /ViewerPreferences 2 0 R
        /Type /Catalog

  **B.pdf**
        %PDF-1.4
        %±²³´
        %Created by Wnv/EP PDF Tools v6.1
        1 0! bj
        <<
        /PageMode /UseNone
        /ViewerPreferences 2 0 R
        /]
        pe /Catalog

This is how I decode the string:

    private static void decodeStringToFile(String encodedInputStr,
            String outputFileName) throws IOException {
        BufferedReader in = null;
        BufferedOutputStream out = null;
        try {
            in = new BufferedReader(new StringReader(encodedInputStr));
            out = new BufferedOutputStream(new FileOutputStream(outputFileName));
            decodeStream(in, out);
            out.flush();
        } finally {
            if (in != null)
                in.close();
            if (out != null)
                out.close();
        }
    }

    private static void decodeStream(BufferedReader in, OutputStream out)
            throws IOException {
        while (true) {
            String s = in.readLine();
            if (s == null)
                break;
            //System.out.println(s);
            byte[] buf = Base64.decodeBase64(s);
            out.write(buf);
        }
    }


阳光下慵懒的猫 2025-01-04 23:55:13

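You are breaking the decoding by working line by line. Base64 decoders simply ignore whitespace, and every 4 Base64 characters encode 3 bytes of the original, so a byte of the original content may well be split across two lines of Base64 text. When a line break falls inside a 4-character group, decoding each line separately misaligns the groups and corrupts the output. You should concatenate all the lines and decode the file in one go.
Prefer byte[] over String when supplying content to the Base64 class methods. A String implies a character-set encoding, which may not do what you want.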
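
The advice above could be sketched roughly as follows. This is a minimal sketch, not the asker's code: it assumes Java 8 or later and swaps in the JDK's java.util.Base64 MIME decoder (which skips line separators) in place of Commons Codec, so that it runs without extra dependencies. The class name WholeStringBase64Decode and the helper decodeAll are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical class name, used only for this illustration.
final class WholeStringBase64Decode {

    // Decodes the complete Base64 text in a single call instead of line by line.
    // The MIME decoder skips line separators and other characters outside the
    // Base64 alphabet, so embedded line breaks do not split any 4-character group.
    static byte[] decodeAll(String encodedInputStr) {
        // Base64 text is plain ASCII, so converting it to bytes is lossless
        // and avoids relying on the platform default charset.
        byte[] encodedBytes = encodedInputStr.getBytes(StandardCharsets.US_ASCII);
        return Base64.getMimeDecoder().decode(encodedBytes);
    }

    // Writes the decoded bytes to a file without any character conversion.
    static void decodeStringToFile(String encodedInputStr, String outputFileName)
            throws IOException {
        Files.write(Paths.get(outputFileName), decodeAll(encodedInputStr));
    }

    public static void main(String[] args) {
        // Sample data (an assumption for the demo, not the asker's file).
        byte[] original = "%PDF-1.4 sample bytes".getBytes(StandardCharsets.ISO_8859_1);
        String oneLine = Base64.getEncoder().encodeToString(original);

        // Break the text in the middle of a 4-character group, as a
        // line-wrapped web service payload might.
        String broken = oneLine.substring(0, 5) + "\r\n" + oneLine.substring(5);

        // Prints whether the round trip preserved every byte.
        System.out.println(Arrays.equals(original, decodeAll(broken)));
    }
}
```

With Commons Codec, the equivalent change would be to pass the whole received string (or its bytes) to a single decode call rather than calling the decoder once per line inside the readLine loop.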