Determine whether a string is base64 using JavaScript

Posted on 2024-12-11 12:03:11

I'm using the window.atob('string') function to decode a string from base64 to a string. Now I wonder, is there any way to check that 'string' is actually valid base64? I would like to be notified if the string is not base64 so I can perform a different action.

Comments (14)

寒江雪… 2024-12-18 12:03:11

Building on @anders-marzi-tornblad's answer, using the regex to make a simple true/false test for base64 validity is as easy as follows:

var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

base64regex.test("SomeStringObviouslyNotBase64Encoded...");             // FALSE
base64regex.test("U29tZVN0cmluZ09idmlvdXNseU5vdEJhc2U2NEVuY29kZWQ=");   // TRUE

Update 2021

  • Following the comments below, it turns out that this regex-based solution provides a more accurate check than simply trying atob, because the latter doesn't check for =-padding. According to RFC 4648, =-padding may only be ignored for base16 encoding or if the data length is known implicitly.
  • The regex-based solution also seems to be the fastest, as hinted by kai. Since jsperf seems flaky at the moment, I made a new test on jsbench, which confirms this.
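
To illustrate the padding point, here is a small sketch that runs both checks on a padded and an unpadded input. It assumes a runtime where atob is available as a global (browsers, or a recent Node.js); the helper name decodesWithAtob is made up for this example.

```javascript
// Same regex as in the answer above.
const base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

// Hypothetical helper: true if atob accepts the input without throwing.
function decodesWithAtob(str) {
  try {
    atob(str);
    return true;
  } catch (e) {
    return false;
  }
}

const padded = "YQ==";  // canonical encoding of "a"
const unpadded = "YQ";  // same data with the padding stripped

console.log(base64regex.test(padded), decodesWithAtob(padded));     // true true
console.log(base64regex.test(unpadded), decodesWithAtob(unpadded)); // false true
```

The regex rejects the unpadded form, while atob decodes it to the same value.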
国粹 2024-12-18 12:03:11

If you want to check whether it can be decoded or not, you can simply try decoding it and see whether it failed:

try {
    window.atob(str);
} catch(e) {
    // something failed

    // if you want to be specific and only catch the error which means
    // the base 64 was invalid, then check for 'e.code === 5'.
    // (because 'DOMException.INVALID_CHARACTER_ERR === 5')
}
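
As a sketch, the try/catch can be wrapped into a boolean helper. The function name canDecodeBase64 is hypothetical, and the code assumes atob is a global; it checks the exception's name rather than the legacy numeric code, which should be equivalent for this error.

```javascript
// Hypothetical wrapper around the try/catch shown above.
function canDecodeBase64(str) {
  try {
    atob(str);
    return true;
  } catch (e) {
    // Undecodable input raises a DOMException named "InvalidCharacterError"
    // (legacy code 5); any other error is rethrown.
    if (e && e.name === "InvalidCharacterError") {
      return false;
    }
    throw e;
  }
}

console.log(canDecodeBase64("aGVsbG8="));    // true
console.log(canDecodeBase64("not base64!")); // false
```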
默嘫て 2024-12-18 12:03:11

This should do the trick.

function isBase64(str) {
    if (str ==='' || str.trim() ===''){ return false; }
    try {
        return btoa(atob(str)) == str;
    } catch (err) {
        return false;
    }
}
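
One possible hardening, shown only as a sketch: the function above calls str.trim() outside the try block, so a non-string argument such as null would throw. The variant below (the name isBase64Strict is made up) adds a type guard and otherwise keeps the same round-trip idea.

```javascript
// Hypothetical variant of isBase64 that tolerates non-string input.
function isBase64Strict(value) {
  if (typeof value !== "string" || value.trim() === "") {
    return false;
  }
  try {
    // Round trip: only canonical, padded base64 survives unchanged.
    return btoa(atob(value)) === value;
  } catch (err) {
    return false;
  }
}

console.log(isBase64Strict("aGVsbG8="));   // true
console.log(isBase64Strict("aGVsbG8"));    // false (padding missing)
console.log(isBase64Strict(" aGVsbG8= ")); // false (atob drops the whitespace)
console.log(isBase64Strict(null));         // false
```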
淡紫姑娘! 2024-12-18 12:03:11

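If "valid" means "only has base64 chars in it" then check against /^[A-Za-z0-9+/=]*$/ (anchored at both ends; the unanchored /[A-Za-z0-9+/=]/ matches any string that contains at least one base64 character).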

If "valid" means a "legal" base64-encoded string then you should check for the = at the end.

If "valid" means it's something reasonable after decoding then it requires domain knowledge.
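
The first two levels can be sketched as follows. The function names are invented for this example, and the second check is just one reasonable reading of "legal" (length divisible by 4, = only as trailing padding). The third level is not shown because it depends on what the decoded data is supposed to be.

```javascript
// Level 1: only characters from the base64 alphabet (plus "=") appear.
function hasOnlyBase64Chars(str) {
  return /^[A-Za-z0-9+/=]*$/.test(str);
}

// Level 2: additionally, the length is a multiple of 4 and "=" appears
// only as one or two trailing padding characters.
function isWellFormedBase64(str) {
  return str.length % 4 === 0 && /^[A-Za-z0-9+/]*={0,2}$/.test(str);
}

console.log(hasOnlyBase64Chars("ab=c"), isWellFormedBase64("ab=c")); // true false
console.log(hasOnlyBase64Chars("YWI="), isWellFormedBase64("YWI=")); // true true
```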

痴情换悲伤 2024-12-18 12:03:11

I would use a regular expression for that. Try this one:

/^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/

Explanation:

^                          # Start of input
([0-9a-zA-Z+/]{4})*        # Groups of 4 valid characters decode
                           # to 24 bits of data for each group
(                          # Either ending with:
    ([0-9a-zA-Z+/]{2}==)   # two valid characters followed by ==
    |                      # , or
    ([0-9a-zA-Z+/]{3}=)    # three valid characters followed by =
)?                         # , or nothing
$                          # End of input
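
A quick way to see what the pattern accepts is to run it over a few sample strings. The sample list below is my own and only illustrative.

```javascript
const base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

// Illustrative inputs: complete groups, both padding forms, and malformed tails.
const samples = ["", "QQ==", "QUI=", "QUJD", "QUJDRA==", "QUJDR", "QQ=", "Q==="];

for (const s of samples) {
  console.log(JSON.stringify(s), base64regex.test(s));
}
```

The first five samples match and the last three do not. Note that the empty string matches, because both the repeated group and the padded tail are optional, so callers who want to reject empty input need a separate check.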
枫以 2024-12-18 12:03:11

This method attempts to decode, then re-encode, and compares the result to the original. It could also be combined with the other answers for environments that throw on parsing errors. It's also possible to have a string that looks like valid base64 from a regex point of view but is not actual base64.

if(btoa(atob(str))==str){
  //...
}
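
One example of a string that passes the regex but fails the round trip, as a sketch assuming global atob and btoa: "YR==" has the right shape, but its last character carries bits that a canonical encoder would have left at zero.

```javascript
const base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

const canonical = "YQ==";    // encodes the single byte "a"
const nonCanonical = "YR=="; // same byte, but with stray bits in the last character

console.log(base64regex.test(nonCanonical));            // true: the shape is right
console.log(atob(nonCanonical) === atob(canonical));    // true: the decoder drops the extra bits
console.log(btoa(atob(nonCanonical)) === nonCanonical); // false: re-encoding yields "YQ=="
```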
少女净妖师 2024-12-18 12:03:11

This is how it's done in one of my favorite validation libs:

const notBase64 = /[^A-Z0-9+\/=]/i;

export default function isBase64(str) {
  assertString(str); // remove this line and make sure you pass in a string
  const len = str.length;
  if (!len || len % 4 !== 0 || notBase64.test(str)) {
    return false;
  }
  const firstPaddingChar = str.indexOf('=');
  return firstPaddingChar === -1 ||
    firstPaddingChar === len - 1 ||
    (firstPaddingChar === len - 2 && str[len - 1] === '=');
}

https://github.com/chriso/validator.js/blob/master/src/lib/isBase64.js
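
For trying this outside the library, here is a standalone adaptation. It is a sketch, not the library's code: the export is dropped and a plain typeof check stands in for the assertString helper.

```javascript
const notBase64 = /[^A-Z0-9+\/=]/i;

// Standalone adaptation: a typeof check replaces the library's assertString.
function isBase64(str) {
  if (typeof str !== "string") {
    throw new TypeError("Expected a string");
  }
  const len = str.length;
  // Reject empty input, lengths that are not a multiple of 4, and foreign characters.
  if (!len || len % 4 !== 0 || notBase64.test(str)) {
    return false;
  }
  // "=" may only appear as the last character or as the last two characters.
  const firstPaddingChar = str.indexOf("=");
  return firstPaddingChar === -1 ||
    firstPaddingChar === len - 1 ||
    (firstPaddingChar === len - 2 && str[len - 1] === "=");
}

console.log(isBase64("Zm9v")); // true
console.log(isBase64("Zg==")); // true
console.log(isBase64("Zg="));  // false: length is not a multiple of 4
console.log(isBase64("Z===")); // false: padding starts too early
console.log(isBase64("Zm=v")); // false: "=" in the middle
```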

盗琴音 2024-12-18 12:03:11

For me, a string is likely base64-encoded if:

  1. its length is divisible by 4
  2. it uses only A-Z, a-z, 0-9, +, / and =
  3. it uses = only at the end (0-2 chars)

so the code would be

function isBase64(str)
{
    return str.length % 4 == 0 && /^[A-Za-z0-9+/]+[=]{0,2}$/.test(str);
}
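
A few edge cases, as a sketch with inputs I picked for illustration, show how the three rules combine:

```javascript
function isBase64(str) {
  return str.length % 4 == 0 && /^[A-Za-z0-9+/]+[=]{0,2}$/.test(str);
}

// Illustrative inputs mapped to the result each rule set should produce.
const cases = {
  "QUJD": true,   // one full group
  "QUI=": true,   // one padding character
  "QQ==": true,   // two padding characters
  "QUJ": false,   // length not divisible by 4
  "Q===": false,  // three padding characters
  "QU=D": false,  // "=" not at the end
  "": false       // the "+" quantifier requires at least one character
};

for (const [input, expected] of Object.entries(cases)) {
  console.log(JSON.stringify(input), isBase64(input), "expected:", expected);
}
```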
£噩梦荏苒 2024-12-18 12:03:11

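Implementation in Node.js (validates not just the allowed characters but the base64 string as a whole):


    const validateBase64 = function(encoded1) {
        // Decode to a one-byte-per-character ('binary', i.e. latin1) string so that
        // re-encoding with the same encoding reproduces the original bytes.
        var decoded1 = Buffer.from(encoded1, 'base64').toString('binary');
        var encoded2 = Buffer.from(decoded1, 'binary').toString('base64');
        return encoded1 == encoded2;
    }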
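
The choice of intermediate string encoding matters for this kind of round trip. The sketch below, with function names made up for the comparison, contrasts a variant that passes through a UTF-8 string with one that uses latin1 ('binary') in both directions. It assumes Node.js, where Buffer is a global.

```javascript
// Variant A: UTF-8 string on the way out, latin1 ('binary') on the way back.
function roundTripViaUtf8(encoded) {
  const decoded = Buffer.from(encoded, "base64").toString("utf8");
  return Buffer.from(decoded, "binary").toString("base64") === encoded;
}

// Variant B: latin1 in both directions, so every byte maps to exactly one character.
function roundTripViaLatin1(encoded) {
  const decoded = Buffer.from(encoded, "base64").toString("latin1");
  return Buffer.from(decoded, "latin1").toString("base64") === encoded;
}

const ascii = Buffer.from("hello").toString("base64");  // "aGVsbG8="
const rawByte = Buffer.from([0xff]).toString("base64"); // "/w=="

console.log(roundTripViaUtf8(ascii), roundTripViaLatin1(ascii));     // true true
console.log(roundTripViaUtf8(rawByte), roundTripViaLatin1(rawByte)); // false true
```

The byte 0xFF is not valid UTF-8, so variant A turns it into a replacement character and cannot reproduce the input; variant B preserves it.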

贵在坚持 2024-12-18 12:03:11

I have tried the below answers but there are some issues.

var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
base64regex.test(value)

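When using this, it returns true for "BBBBB" (capital letters), and it also returns true for "4444".

I added some code to make it work correctly for me.

// The function name is added for illustration; the original snippet was anonymous.
function decodeIfBase64(value) {
  var base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
  // Besides matching the regex, reject purely numeric and purely alphabetic input.
  if (base64regex.test(value) && isNaN(value) && !/^[a-zA-Z]+$/.test(value)) {
    return decodeURIComponent(escape(window.atob(value)));
  }
  // Falls through and returns undefined when the value is not treated as base64.
}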
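
The decodeURIComponent(escape(...)) idiom interprets the decoded bytes as UTF-8, and escape is deprecated. A possible alternative, shown only as a sketch, uses TextDecoder. The helper name decodeBase64Utf8 is hypothetical, and the code assumes a runtime with global atob and a TextDecoder that supports the fatal option.

```javascript
// Hypothetical helper: decode base64 that is expected to contain UTF-8 text.
// Returns null when the input is not decodable base64 or not valid UTF-8.
function decodeBase64Utf8(value) {
  try {
    const binary = atob(value);
    // Each character of the binary string holds one byte value (0-255).
    const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
    // fatal: true makes invalid UTF-8 throw instead of being silently replaced.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch (e) {
    return null;
  }
}

console.log(decodeBase64Utf8("w6k=")); // "é"
console.log(decodeBase64Utf8("/w==")); // null (0xFF is not valid UTF-8)
console.log(decodeBase64Utf8("%%%"));  // null (atob throws)
```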
孤独患者 2024-12-18 12:03:11

Throwing my results into the fray here.
In my case, there was a string that was not meant to be base64 but was still valid base64, so it was getting decoded into gibberish (e.g. yyyyyyyy is valid base64 according to the usual regex).

My testing led me to first check whether the string is a valid base64 string using the regex others shared here, then decode it and test whether the result is a valid ASCII string, since (in my case) I should only get ASCII characters back. (This can probably be extended to include other characters that may not fall into the ASCII range.)

This is a bit of a mix of multiple answers.

let base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
function isBase64(str) {
    if (str ==='' || str.trim() ===''){ return false; }
    try {
        if (base64regex.test(str)) {
            return /^[\x00-\x7F]*$/.test(atob(str));
        } else {
            return false
        }
    } catch (err) {
        // atob threw, so the string could not be decoded
        return false;
    }
}

As always with my JavaScript answers, I have no idea what I am doing, so there might be a better way to write this. But it works for my needs and covers the case where a string isn't supposed to be base64 but is still valid base64 and decodes without error.
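
The idea of extending the check beyond ASCII can be sketched by letting the caller supply the plausibility test for the decoded text. The name isBase64Of and its second parameter are inventions for this example; the default keeps the ASCII-only behaviour.

```javascript
const base64regex = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;

// Hypothetical generalisation: looksPlausible decides whether the decoded
// text is acceptable. The default accepts ASCII only.
function isBase64Of(str, looksPlausible = (decoded) => /^[\x00-\x7F]*$/.test(decoded)) {
  if (typeof str !== "string" || str.trim() === "" || !base64regex.test(str)) {
    return false;
  }
  try {
    return looksPlausible(atob(str));
  } catch (err) {
    return false;
  }
}

const digitsOnly = (decoded) => /^\d+$/.test(decoded);

console.log(isBase64Of("aGVsbG8="));             // true: decodes to "hello"
console.log(isBase64Of("yyyyyyyy"));             // false: decodes to non-ASCII bytes
console.log(isBase64Of("MTIzNA==", digitsOnly)); // true: decodes to "1234"
console.log(isBase64Of("aGVsbG8=", digitsOnly)); // false: "hello" is not numeric
```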

滥情空心 2024-12-18 12:03:11

Try the code below, where str is the string you want to check.

Buffer.from(str, 'base64').toString('base64') === str
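
Wrapped into a function, the one-liner might look like the sketch below. It assumes Node.js; the name isCanonicalBase64 and the decision to reject the empty string (which the bare one-liner accepts) are my own choices.

```javascript
// Hypothetical wrapper around the Buffer round trip.
function isCanonicalBase64(str) {
  return typeof str === "string" &&
    str.length > 0 &&
    Buffer.from(str, "base64").toString("base64") === str;
}

console.log(isCanonicalBase64("aGVsbG8="));  // true
console.log(isCanonicalBase64("aGVsbG8"));   // false: padding is missing
console.log(isCanonicalBase64("aGVs*bG8=")); // false: "*" is not in the alphabet
console.log(isCanonicalBase64(""));          // false: empty input is rejected here
```

The comparison is what does the validation: Buffer.from does not throw on malformed base64, so the decoded result has to be re-encoded and compared with the input.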
征棹 2024-12-18 12:03:11

All of the answers are wrong when you test them with a word like "demo":

function isBase64(str) {
    try {
        return btoa(atob(str)) === str;
    } catch (e) {
        return false;
    }
}
console.log(isBase64("demo")) // logs true, even though "demo" is an ordinary word

So I asked Copilot, and this is the answer:

function mightBeBase64(str) {
  // Base64 strings are usually a multiple of 4 in length
  if (str.length % 4 !== 0) {
    return false;
  }

  // Check for base64 character set
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(str)) {
    return false;
  }

  // Attempt to decode and check if the result is a valid string
  try {
    const decoded = atob(str);
    // Check if the decoded string contains only printable characters
    if (/^[\x20-\x7E]*$/.test(decoded)) {
      return true;
    }
  } catch (e) {
    return false;
  }

  return false;
}

// Example usage:
console.log(mightBeBase64("demo")); // Should return false
console.log(mightBeBase64("SGVsbG8sIENvcGlsb3Qh")); // Should return true
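
To see why "demo" slips past the purely structural checks, it helps to look at what it decodes to. This is a small sketch assuming global atob and btoa:

```javascript
const word = "demo";
const decoded = atob(word);
// Byte values of the decoded binary string.
const bytes = Array.from(decoded, (ch) => ch.charCodeAt(0));

console.log(bytes);                          // [ 117, 233, 168 ]
console.log(btoa(decoded) === word);         // true: the round trip cannot tell
console.log(/^[\x20-\x7E]*$/.test(decoded)); // false: 233 and 168 are not printable ASCII
console.log(atob("SGVsbG8sIENvcGlsb3Qh"));   // Hello, Copilot!
```

Any four characters from the base64 alphabet decode to three bytes, so only a check on the decoded content can separate ordinary words from intended base64. That check is a heuristic: it also rejects genuine base64 of binary data.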

猫弦 2024-12-18 12:03:11

I know it's late, but I tried to keep it simple here:

function isBase64(encodedString) {
    var regexBase64 = /^([0-9a-zA-Z+/]{4})*(([0-9a-zA-Z+/]{2}==)|([0-9a-zA-Z+/]{3}=))?$/;
    return regexBase64.test(encodedString);   // returns true if it's a base64 string
}