Detecting base64 encoding in PHP?

Posted on 2024-08-27 06:59:00

Is there some way to detect whether a string has been encoded with base64_encode() in PHP?

We're converting some storage from plain text to base64 and part of it lives in a cookie that needs to be updated. I'd like to reset their cookie if the text has not yet been encoded, otherwise leave it alone.

Comments (12)

好菇凉咱不稀罕他 2024-09-03 06:59:00

Apologies for a late response to an already-answered question, but I don't think base64_decode($x,true) is a good enough solution for this problem. In fact, there may not be a very good solution that works against any given input. For example, I can put lots of bad values into $x and not get a false return value.

var_dump(base64_decode('wtf mate',true));
string(5) "���j�"

var_dump(base64_decode('This is definitely not base64 encoded',true));
string(24) "N���^~)��r��[jǺ��ܡם"

I think that in addition to the strict return value check, you'd also need to do post-decode validation. The most reliable way is if you could decode and then check against a known set of possible values.
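
As a minimal sketch of the "decode, then check against a known set" idea, assuming the cookie can only hold a small list of values: the whitelist `$allowedValues` and the helper `decodeKnownValue()` are hypothetical names made up for illustration, not part of the original answer.

```php
<?php
// Hypothetical whitelist of values the cookie is allowed to hold (an assumption).
$allowedValues = ['light', 'dark', 'system'];

/**
 * Returns the decoded value if $raw is strict base64 for an allowed value,
 * otherwise null.
 */
function decodeKnownValue(string $raw, array $allowedValues): ?string
{
    // Strict mode rejects characters outside the base64 alphabet.
    $decoded = base64_decode($raw, true);
    if ($decoded === false) {
        return null;
    }
    // Post-decode validation: only accept values we actually expect.
    return in_array($decoded, $allowedValues, true) ? $decoded : null;
}

var_dump(decodeKnownValue(base64_encode('dark'), $allowedValues)); // string(4) "dark"
var_dump(decodeKnownValue('dark', $allowedValues));                // NULL
```

The second call shows why the post-decode check matters: the plain text 'dark' is itself a syntactically valid base64 string, so the strict decode alone would not reject it.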

A more general solution with less than 100% accuracy (more reliable for longer strings, inaccurate for short ones) is to check the decoded output and see whether many bytes fall outside the normal range for UTF-8 (or whatever encoding you use) text.

See this example:

<?php
$english = array();
foreach (str_split('az019AZ~~~!@#$%^*()_+|}?><": Iñtërnâtiônàlizætiøn') as $char) {
  echo ord($char) . "\n";
  $english[] = ord($char);
}
  echo "Max value english = " . max($english) . "\n";

$nonsense = array();
echo "\n\nbase64:\n";
foreach (str_split(base64_decode('Not base64 encoded',true)) as $char) {
  echo ord($char) . "\n";
  $nonsense[] = ord($char);
}

  echo "Max nonsense = " . max($nonsense) . "\n";

?>

Results:

Max value english = 195
Max nonsense = 233

So you may do something like this:

if ($maxDecodedValue > 200) {
    // decoded string is garbage: the original string was not base64 encoded
} else {
    // decoded string is useful: it was base64 encoded
}

You should probably use the mean of the decoded values instead of the max. I only used max() in this example because PHP sadly has no built-in mean() (array_sum($values) / count($values) does the job). Which measure you use (mean, max, etc.) and which threshold (e.g. 200) depends on your estimated usage profile.
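
A mean over the byte values could be computed roughly as follows. This is only a sketch: `meanByteValue()` is a hypothetical helper name, and any threshold you compare the result against still has to be tuned to your own data.

```php
<?php
/**
 * Mean byte value of a string, or null for an empty string.
 */
function meanByteValue(string $bytes): ?float
{
    if ($bytes === '') {
        return null;
    }
    // unpack('C*', ...) yields one unsigned integer (0-255) per byte.
    $values = unpack('C*', $bytes);
    return array_sum($values) / count($values);
}

var_dump(meanByteValue('AB')); // float(65.5)
```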

In conclusion, the only winning move is not to play. I'd try to avoid having to discern base64 in the first place.
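
One way to avoid the guessing game, sketched under the assumption that you control how the cookie is written: tag migrated values with an explicit marker. The `b64:` prefix and both function names are hypothetical choices for illustration.

```php
<?php
// Hypothetical marker (an assumption). The ':' is outside the base64
// alphabet, and the prefix is stripped before decoding.
const ENCODED_PREFIX = 'b64:';

function encodeCookieValue(string $plain): string
{
    return ENCODED_PREFIX . base64_encode($plain);
}

/**
 * Returns the plain value whether or not the cookie has been migrated.
 */
function readCookieValue(string $raw): string
{
    if (strncmp($raw, ENCODED_PREFIX, strlen(ENCODED_PREFIX)) !== 0) {
        return $raw; // legacy plain-text cookie: the caller should re-set it
    }
    $decoded = base64_decode(substr($raw, strlen(ENCODED_PREFIX)), true);
    // Fall back to the raw value if the payload is not valid base64.
    return $decoded === false ? $raw : $decoded;
}

var_dump(readCookieValue(encodeCookieValue('hello world'))); // string(11) "hello world"
var_dump(readCookieValue('hello world'));                    // string(11) "hello world"
```

The remaining ambiguity is a legacy plain-text value that happens to start with the marker itself, so pick a prefix that cannot occur in your existing data.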

月光色 2024-09-03 06:59:00
function is_base64_encoded($data)
{
    // Matches only the base64 alphabet followed by at most two '=' padding characters
    return preg_match('%^[a-zA-Z0-9/+]*={0,2}$%', $data) === 1;
}

is_base64_encoded("iash21iawhdj98UH3"); // true
is_base64_encoded("#iu3498r"); // false
is_base64_encoded("asiudfh9w=8uihf"); // false
is_base64_encoded("a398UIhnj43f/1!+sadfh3w84hduihhjw=="); // false

http://php.net/manual/en/function.base64-decode.php#81425

梦里泪两行 2024-09-03 06:59:00

I had the same problem and ended up with this solution:

if ( base64_encode(base64_decode($data)) === $data){
    echo '$data is valid';
} else {
    echo '$data is NOT valid';
}
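
The round-trip check can also be wrapped in a function, here combined with strict decoding. This is a sketch with a hypothetical name, `looksLikeBase64()`, and it shows the limit of the approach: short plain text made up only of base64 characters round-trips as well.

```php
<?php
function looksLikeBase64(string $data): bool
{
    // Strict decode, then require the re-encoded form to match byte for byte.
    $decoded = base64_decode($data, true);
    return $decoded !== false && base64_encode($decoded) === $data;
}

var_dump(looksLikeBase64(base64_encode('hello'))); // bool(true)
var_dump(looksLikeBase64('hello world'));          // bool(false)
var_dump(looksLikeBase64('node'));                 // bool(true): plain text that is also valid base64
```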
江湖彼岸 2024-09-03 06:59:00

Better late than never: You could maybe use mb_detect_encoding() to find out whether the encoded string appears to have been some kind of text:

function is_base64_string($s) {
  // first check if we're dealing with an actual valid base64 encoded string
  if (($b = base64_decode($s, TRUE)) === FALSE) {
    return FALSE;
  }

  // now check whether the decoded data could be actual text
  $e = mb_detect_encoding($b);
  if (in_array($e, array('UTF-8', 'ASCII'))) { // YMMV
    return TRUE;
  } else {
    return FALSE;
  }
}

UPDATE: for those who like it short:

function is_base64_string_s($str, $enc=array('UTF-8', 'ASCII')) {
  return !(($b = base64_decode($str, TRUE)) === FALSE) && in_array(mb_detect_encoding($b), $enc);
}
古镇旧梦 2024-09-03 06:59:00

We can combine three checks into one function to test whether a given string is valid base64.

function validBase64($string)
{
    // Check that the string contains only base64 characters (plus line breaks)
    if (!preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $string)) {
        return false;
    }

    // Decode the string in strict mode
    $decoded = base64_decode($string, true);
    if ($decoded === false) {
        return false;
    }

    // Re-encode and compare with the original
    return base64_encode($decoded) === $string;
}
笔落惊风雨 2024-09-03 06:59:00

I was about to build a base64 toggle in PHP, and this is what I did:

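function base64Toggle($str) {
    if (!preg_match('~[^0-9a-zA-Z+/=]~', $str)) {
        $decoded = base64_decode($str);
        $check = str_split($decoded);
        $x = 0;
        // Count bytes above the printable ASCII range
        foreach ($check as $char) if (ord($char) > 126) $x++;
        // Treat the input as base64 if fewer than 30% of the decoded bytes are non-ASCII;
        // max(..., 1) avoids a division by zero when the decoded string is empty (PHP 8.2+)
        if ($x / max(count($check), 1) * 100 < 30) return $decoded;
    }
    return base64_encode($str);
}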

It works perfectly for me.
Here are my complete thoughts on it: http://www.albertmartin.de/blog/code.php/19/base64-detection

And here you can try it: http://www.albertmartin.de/tools

游魂 2024-09-03 06:59:00

Without the strict flag, base64_decode() will not return FALSE if the input is not valid base64-encoded data. One alternative is imap_base64() (part of the IMAP extension), which returns FALSE if $text contains characters outside the Base64 alphabet.
imap_base64() Reference
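
For comparison, the behaviour of base64_decode() with and without its second (strict) argument can be checked with a short sketch; the sample input is made up for illustration.

```php
<?php
$input = 'not*valid*base64';

// Default mode: the '*' characters are silently dropped and the rest is decoded.
var_dump(is_string(base64_decode($input)));  // bool(true)

// Strict mode: any character outside the base64 alphabet makes the call fail.
var_dump(base64_decode($input, true));       // bool(false)
```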

旧话新听 2024-09-03 06:59:00

Here's my solution:

if(empty(htmlspecialchars(base64_decode($string, true)))) {
return false;
}

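It returns false when the decoded $string is not valid text, for example for the inputs "node", "123", " ", etc. This relies on htmlspecialchars() returning an empty string when its input is not valid in the target encoding (UTF-8 by default). Since PHP 8.1 the default flags include ENT_SUBSTITUTE, which replaces invalid sequences instead, so on newer versions pass flags such as ENT_COMPAT explicitly.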

墟烟 2024-09-03 06:59:00
$is_base64 = function(string $string) : bool {
    $zero_one = ['MA==', 'MQ=='];
    if (in_array($string, $zero_one)) return TRUE;

    if (empty(htmlspecialchars(base64_decode($string, TRUE))))
        return FALSE;

    return TRUE;
};

var_dump('*** These yell false ***');
var_dump($is_base64(''));
var_dump($is_base64('This is definitely not base64 encoded'));
var_dump($is_base64('node'));
var_dump($is_base64('node '));
var_dump($is_base64('123'));
var_dump($is_base64(0));
var_dump($is_base64(1));
var_dump($is_base64(123));
var_dump($is_base64(1.23));

var_dump('*** These yell true ***');
var_dump($is_base64(base64_encode('This is definitely base64 encoded')));
var_dump($is_base64(base64_encode('node')));
var_dump($is_base64(base64_encode('123')));
var_dump($is_base64(base64_encode(0)));
var_dump($is_base64(base64_encode(1)));
var_dump($is_base64(base64_encode(123)));
var_dump($is_base64(base64_encode(1.23)));
var_dump($is_base64(base64_encode(TRUE)));

var_dump('*** Should these yell true? Might be edge cases ***');
var_dump($is_base64(base64_encode('')));
var_dump($is_base64(base64_encode(FALSE)));
var_dump($is_base64(base64_encode(NULL)));
锦欢 2024-09-03 06:59:00

Maybe this isn't exactly what you asked for, but I hope it will be useful to somebody.

In my case the solution was to encode all data with json_encode and then base64_encode.

$encoded=base64_encode(json_encode($data));

This value can be stored or used however you need.
Then, to check whether a value is your encoded data rather than just a text string, you simply use

function isData($test_string){
   // Strict decode first, then require the decoded bytes to parse as JSON
   $decoded = base64_decode($test_string, true);
   if ($decoded !== false && json_decode($decoded)) {
      return true;
   } else {
      return false;
   }
}

or alternatively

function isNotData($test_string){
   // Inverse of the check above: true when the value is not base64-encoded JSON
   $decoded = base64_decode($test_string, true);
   if ($decoded !== false && json_decode($decoded)) {
      return false;
   } else {
      return true;
   }
}

Thanks to the authors of all the previous answers in this thread :)

倾`听者〃 2024-09-03 06:59:00

Usually a text in base64 has no spaces.

I used this function which worked fine for me. It tests if the number of spaces in the string is less than 1 in 20.

That is, fewer than 1 space per 20 characters: (spaces / strlen) < 0.05

function normalizaBase64($data){
    // Avoid a division by zero on empty input
    if ($data === '') {
        return $data;
    }
    $spaces = substr_count($data, " ");
    if (($spaces / strlen($data)) < 0.05)
    {
        return base64_decode($data);
    }
    return $data;
}
绮筵 2024-09-03 06:59:00

Your best option is:

$base64_test = mb_substr(trim($some_base64_data), 0, 76);
return base64_decode($base64_test, true) !== false;