A true multibyte string shuffle function for PHP?

Posted 2024-10-26 16:13:46 · 418 words · 6 views · 0 comments

I have a unique problem with multibyte character strings and need to be able to shuffle, with some fair degree of randomness, a long UTF-8 encoded multibyte string in PHP without dropping or losing or repeating any of the characters.

In the PHP manual under str_shuffle there is a multibyte function (the first user-submitted one) that doesn't work: if I pass it a string containing, for example, all the Japanese hiragana and katakana (say 120 characters long), I get back a string that is only 119 or 118 characters long. Sometimes I've also seen duplicate characters even though the original string has none. So that function isn't usable.

To make this more complex, I also need to include if possible Japanese UTF-8 newlines and line feeds and punctuation.

Can anyone with experience dealing in multiple languages with UTF-8 mb strings help? Does PHP have any built in functions to do this? str_shuffle is EXACTLY what I want. I just need it to also work on multibyte chars.

Thanks very much!

Comments (3)

等风来 2024-11-02 16:13:51

I like to use this function:

function mb_str_shuffle($multibyte_string = "abcčćdđefghijklmnopqrsštuvwxyzžß,.-+'*?=)(/&%$#!~ˇ^˘°˛`˙´˝") {
    $characters_array = mb_str_split($multibyte_string);
    shuffle($characters_array);
    return implode('', $characters_array); // or join('', $characters_array); if you have a death wish (JK)
}
  1. Split the string into an array of multibyte characters (mb_str_split requires PHP 7.4 or later)
  2. Shuffle the array, which doesn't care that its residents are multibyte
  3. Join the shuffled array back together into a string

Of course, I normally wouldn't give the function's parameter a default value.
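To check the concern raised in the question (no characters dropped or duplicated), the characters can be compared before and after shuffling. The following is a minimal sketch, assuming PHP 7.4+ with the mbstring extension; the helper names shuffle_mb_chars and has_same_chars are made up for this illustration, and the sample text is arbitrary.

```php
<?php
// Sketch only: assumes PHP 7.4+ (mb_str_split) and the mbstring extension.
// shuffle_mb_chars() and has_same_chars() are hypothetical helper names.

function shuffle_mb_chars(string $text): string
{
    $chars = mb_str_split($text, 1, 'UTF-8'); // one array element per code point
    shuffle($chars);
    return implode('', $chars);
}

function has_same_chars(string $a, string $b): bool
{
    $left  = mb_str_split($a, 1, 'UTF-8');
    $right = mb_str_split($b, 1, 'UTF-8');
    sort($left, SORT_STRING);
    sort($right, SORT_STRING);
    return $left === $right; // same characters with the same counts
}

// Hiragana, katakana, Japanese punctuation and a newline.
$original = "あいうえお、カキクケコ。\nさしすせそ！";
$shuffled = shuffle_mb_chars($original);

var_dump(mb_strlen($original, 'UTF-8') === mb_strlen($shuffled, 'UTF-8'));
var_dump(has_same_chars($original, $shuffled));
```

Both checks should report true. Newlines and punctuation are treated like any other character, so they end up at random positions too.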

征﹌骨岁月お 2024-11-02 16:13:50

Try splitting the string using mb_strlen and mb_substr to create an array, then using shuffle before joining it back together again. (Edit: As also demonstrated in @Frosty Z's answer.)

An example from the PHP interactive prompt:

php > $string = "Pretend I'm multibyte!";
php > $len = mb_strlen($string);
php > $sploded = array(); 
php > while($len-- > 0) { $sploded[] = mb_substr($string, $len, 1); }
php > shuffle($sploded);
php > echo join('', $sploded);
rmedt tmu nIb'lyi!eteP

You'll want to be sure to specify the encoding, where appropriate.
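When no encoding is passed, the mb_* functions fall back to the value of mb_internal_encoding(), which depends on configuration. A minimal sketch of the same split with the encoding spelled out, assuming UTF-8 input and the mbstring extension (split_chars_explicit is a hypothetical name used only here):

```php
<?php
// Sketch only: assumes the mbstring extension and UTF-8 input.
// split_chars_explicit() is a hypothetical helper name.

function split_chars_explicit(string $text, string $encoding = 'UTF-8'): array
{
    $chars = [];
    // Use the same encoding for the length and for each slice.
    $length = mb_strlen($text, $encoding);
    for ($i = 0; $i < $length; $i++) {
        $chars[] = mb_substr($text, $i, 1, $encoding);
    }
    return $chars;
}

$chars = split_chars_explicit("日本語、テスト。");
shuffle($chars);
echo implode('', $chars), "\n";
```

Passing the encoding to every call keeps the character count and the extracted characters consistent with each other, whatever the server default is.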

你的他你的她 2024-11-02 16:13:50

This should do the trick, too. I hope.

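// Note: "String" is a reserved class name since PHP 7, so the class is named MbString here.
class MbString
{

    public function mbStrShuffle($string)
    {
        $chars = $this->mbGetChars($string);
        shuffle($chars);
        return implode('', $chars);
    }

    public function mbGetChars($string)
    {
        $chars = [];

        // Pass the encoding to both calls so the length and the slices agree.
        for($i = 0, $length = mb_strlen($string, 'UTF-8'); $i < $length; ++$i)
        {
            $chars[] = mb_substr($string, $i, 1, 'UTF-8');
        }

        return $chars;
    }

}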
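
As an alternative to calling mb_substr once per character, the string can be split in a single pass with a Unicode-aware regular expression. This is only a sketch under the assumption that the input is valid UTF-8; it relies on the PCRE functions bundled with PHP rather than on mbstring, and shuffle_utf8_pcre is a hypothetical name.

```php
<?php
// Sketch only: assumes valid UTF-8 input; uses PCRE instead of mbstring.
// shuffle_utf8_pcre() is a hypothetical helper name.

function shuffle_utf8_pcre(string $text): string
{
    // The empty pattern with the /u modifier splits between UTF-8 characters.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split() returns false on failure, e.g. malformed UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    shuffle($chars);
    return implode('', $chars);
}

echo shuffle_utf8_pcre("ひらがな\nカタカナ。"), "\n";
```

Checking for false matters here: malformed input makes the split fail as a whole, so it is reported instead of silently yielding a shorter string.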