Base64-encoded data mixed with "random" hexadecimal

Posted on 2024-08-12 10:18:47

I get an input string with some data that's base64 encoded. Unfortunately, it has random hexadecimal data (all lowercase) mixed in with it. It's fairly straightforward to sort out by hand because the hexadecimal data all seems to come in segments of 32 characters. For example, I can format an example string like this:

    6dd11d15c419ac219901f14bdd999f38
    0ad94e978ad624d15189f5230e5435a9
    2dc19fe95e583e7d593dd52ae7e68a6e
    465ffa6074a371a8958dad3ad271181a
    23310939b981b4e56f2ecee26f82ec60
    fe04bef49be47603d1278cc80673b226

    VGhpcyBpcyBzb

    6dd11d15c419ac219901f14bdd999f38
    0ad94e978ad624d15189f5230e5435a9
    2dc19fe95e583e7d593dd52ae7e68a6e
    465ffa6074a371a8958dad3ad271181a
    23310939b981b4e56f2ecee26f82ec60
    fe04bef49be47603d1278cc80673b226
    6dd11d15c419ac219901f14bdd999f38
    0ad94e978ad624d15189f5230e5435a9
    2dc19fe95e583e7d593dd52ae7e68a6e
    465ffa6074a371a8958dad3ad271181a
    23310939b981b4e56f2ecee26f82ec60
    fe04bef49be47603d1278cc80673b226

    21lIGJhc2UtNjQ

    bb4af7e61760735ba17c29e8f542a668
    75da91e90863f1ddb7e149297fc59afc
    f5de951fb65d06d2927aab7b9b54830e
    2d935616a54c381c2f38db3731d5a378

    gZW5jb2RlZCB

    6dd11d15c419ac219901f14bdd999f38
    0ad94e978ad624d15189f5230e5435a9
    2dc19fe95e583e7d593dd52ae7e68a6e
    465ffa6074a371a8958dad3ad271181a
    23310939b981b4e56f2ecee26f82ec60
    fe04bef49be47603d1278cc80673b226

    kYXRhIGhvb3JheSE=

Basically, I need to get the base64 stuff out and decode it (in PHP). The catch is that I get it all as one long string and it's not always immediately obvious where to put the linebreaks. For example, the first bit of base64 stuff ends in 'b', easily mistaken for some of the hex data. I'm at something of a loss for how to do this... Any ideas?

Thanks!
-mala

Comments (3)

故人如初 2024-08-19 10:18:47

I think this is an unanswerable problem: it is entirely possible to have 32 characters of base64-encoded data that cannot be distinguished from 32 characters of random hex, because every lowercase hex digit is also a valid base64 character. Without more information about the stream, it is impossible to decide which bucket such data belongs in.
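To make the ambiguity concrete, here is a minimal PHP sketch (not part of the original answer) that takes the first hex segment from the question's sample and checks it both ways. The variable names are illustrative only.

```php
<?php
// First 32-character hex segment from the question's sample input.
$segment = '6dd11d15c419ac219901f14bdd999f38';

// It is a valid lowercase hex string...
$isHex = preg_match('/\A[a-f0-9]{32}\z/', $segment) === 1;

// ...and it is also valid base64: every hex digit is in the base64 alphabet,
// and 32 is a multiple of 4, so strict decoding succeeds without padding.
$decoded = base64_decode($segment, true);
$isBase64 = $decoded !== false;

var_dump($isHex, $isBase64, strlen($decoded));
```

Both checks pass and the decoded result is 24 bytes long, so a character-class test alone cannot tell the two kinds of data apart.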

撩发小公举 2024-08-19 10:18:47

You could do it like:

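    // Read the next 32 characters ($input and $offset are assumed to exist).
    $chunk = substr($input, $offset, 32);
    if (preg_match('/^[a-f0-9]{32}$/', $chunk)) {
        echo "this is a hex string";
    } else {
        // Not pure hex: keep it as base64, minus one trailing hex-looking character.
        $base64[] = preg_replace('/[a-f0-9]$/', '', $chunk);
    }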

Of course, there's the issue of trailing a-f/0-9 characters, but it's a starting point.
You could add some code that counts the characters from the end of your suspected base64 to the next character in [g-zA-Z] and checks whether that count is divisible by 32. If it is, you have probably found all of your original base64. If not, you won't have a clue whether 'b' is the end of your base64 or the beginning of your hex, or whether '6' is the end of your hex or the beginning of your NEXT base64.
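A rough sketch of that counting idea in PHP, assuming hex segments are always exactly 32 lowercase characters. The function name `classifyHexRuns` and the three labels are made up for illustration. It finds every run of hex-looking characters and labels it by length: shorter than 32 cannot contain a hex segment, a multiple of 32 is probably pure hex, and anything else has stray base64 characters at one end or the other.

```php
<?php
// Hypothetical helper: label each run of [a-f0-9] characters by its length.
// Assumes hex segments are exactly $blockLen lowercase characters long.
function classifyHexRuns(string $input, int $blockLen = 32): array
{
    preg_match_all('/[a-f0-9]+/', $input, $m, PREG_OFFSET_CAPTURE);
    $runs = [];
    foreach ($m[0] as [$text, $offset]) {
        $len = strlen($text);
        if ($len < $blockLen) {
            $kind = 'base64';        // too short to hold a hex segment
        } elseif ($len % $blockLen === 0) {
            $kind = 'probably-hex';  // whole number of segments (not a proof)
        } else {
            $kind = 'ambiguous';     // segments plus stray base64 characters
        }
        $runs[] = ['offset' => $offset, 'length' => $len, 'kind' => $kind];
    }
    return $runs;
}

// One hex segment, some base64 ending in 'b', another hex segment, then 'Q'.
$sample = str_repeat('0f', 16) . 'VGhpcyBzb' . str_repeat('0f', 16) . 'Q';
print_r(classifyHexRuns($sample));
```

In this made-up sample the trailing 'b' merges with the following segment into a 33-character run, which is reported as ambiguous: exactly the case described above.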

In short, this is stupid and it makes me sad.

朮生 2024-08-19 10:18:47

It is possible that decoding the base64 collected so far at each decision point (are the next 32 characters base64 or hex?) will give you the clue.

There's also a very slim chance that interpreting one of those hex strings as base64 always yields easily detected garbage, whatever is being decoded.

Otherwise you're out of luck.
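As an illustration of decoding up to a decision point, here is a small PHP sketch. It assumes the payload is printable ASCII text, which the question does not guarantee, and `plausiblePrefix` is a hypothetical helper name. It decodes the complete 4-character groups of a candidate base64 prefix and rejects the candidate if the result contains non-printable bytes.

```php
<?php
// Hypothetical helper: true if the complete 4-character groups of $b64
// decode (strictly) to printable ASCII. Printable payload is an assumption.
function plausiblePrefix(string $b64): bool
{
    $whole = substr($b64, 0, intdiv(strlen($b64), 4) * 4);
    $decoded = base64_decode($whole, true);
    return $decoded !== false
        && preg_match('/\A[\x20-\x7e\r\n\t]*\z/', $decoded) === 1;
}

// Decision point from the question's sample: after "VGhpcyBpcyBz", is the next
// base64 character the 'b' before the hex run, or the '6' that ends it?
var_dump(plausiblePrefix('VGhpcyBpcyBzb21l')); // 'b' reading: "This is some"
var_dump(plausiblePrefix('VGhpcyBpcyBz621l')); // '6' reading: contains byte 0xEB
// A hex digit 'a' in the same slot also decodes cleanly ("This is skme").
var_dump(plausiblePrefix('VGhpcyBpcyBza21l'));
```

The check rules out the '6' reading, but the 'a' reading still passes, so a printable-text test narrows the candidates without settling the question. Telling "some" from "skme" would need knowledge of the expected content.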
