PHP: strange effects of bitwise operators on strings

Posted on 2024-11-16 03:26:51

Update: moved to a new question.

Okay, after reading the PHP documentation, those bitwise operators are clear to me now, but, huh, what is this?

#dump1
var_dump('two identical strings' | 'two identical strings'); // mind the |
// string(21) "two identical strings"

#dump2
var_dump('two identical strings' ^ 'two identical strings'); // mind the ^
// string(21) ""

Why does #dump2 show length == 21 but 0 visible characters?

When I copy the result into Notepad++, there is no sign of any characters inside the string, so how come strlen > 0? This confuses me, because Notepad++ can show some kind of bit-level characters (at least I think they are bit-level, correct me if I'm wrong), see picture:
[image: Notepad++ screenshot]

This is actually the result of:

$string = 'i want you to compare me with an even longer string, that contains even more data and some HTML characters, like &euro; ' ^ 'And I am going to add some HTML characters, like &euro; again to this side and see what happens'; // mind the ^
var_dump(htmlentities($string)); // htmlentities() is needed here; otherwise raw < characters end up in the output and break the page (Chrome then force-closes `<body>`)
// string(101) "(NA'TAOOGCP MI<<m-NC C IRLIHYRSAGTNHEI   RNAEAAOP81#?"

I'd love to see this #dump2-related question answered. Thanks in advance!


While experimenting, I found this:

echo 'one' | 'two'; 
// returns 'o'

echo 'one' | 'twoe';
// returns 'oe'

So, seeing that those two lines return only the letters that appear in both strings, I thought it does some kind of comparison:

echo 'i want you to compare me' | 'compare me with this';    
#crazy-line // returns 'koqoveyou wotko}xise me'

While writing this, even stranger stuff happened. I copied the returned value and pasted it into the post textarea. With the cursor positioned at the end of the crazy-line, it actually sits one "space" to the right of where it should be. Backspacing clears the last character, but the cursor is still one "space" to the right.

That led me to paste this value into Notepad++:
[image: returned value inside Notepad++]
And, huh, as you can see, there is a 'boxy' character within this string that doesn't show up in the browser (at least in mine, the latest Chrome). And yes, when this character is removed from the string (by backspacing), everything goes back to normal: no more "space" to the right.

So, first, what is this | in PHP, and why is there such strange behavior?

And what is this even stranger character that looks like a box and doesn't show up in the browser?

I'm pretty damn curious why this is happening, so here is one more test with longer strings containing HTML entities:

$string = 'i want you to compare me with an even longer string, that contains even more data and some HTML characters, like &euro; ' | 'And I am going to add some HTML characters, like &euro; again to this side and see what happens';
var_dump($string);
// returns string(120) "inwaota}owo}ogcopave omwi||mmncmwwoc|o~wmrl{wing}r{augcontuonwhmweimorendaweealawomepxuo characters, like € "

The last value contains 7 of those 'boxy' characters.

Comments (3)

扛刀软妹 2024-11-23 03:26:51

That's the bitwise OR operator. Its behavior on strings is explained here: http://php.net/manual/en/language.operators.bitwise.php#example-107

<?php
echo 12 ^ 9; // Outputs '5'

echo "12" ^ "9"; // Outputs the Backspace character (ascii 8)
                 // ('1' (ascii 49)) ^ ('9' (ascii 57)) = #8

echo "hallo" ^ "hello"; // Outputs the ascii values #0 #4 #0 #0 #0
                        // 'a' ^ 'e' = #4

echo 2 ^ "3"; // Outputs 1
              // 2 ^ ((int)"3") == 1

echo "2" ^ 3; // Outputs 1
              // ((int)"2") ^ 3 == 1
?>
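
Applied to #dump2 from the question, the same rule suggests why the string has length 21 but nothing visible: each byte XOR-ed with itself is 0, so the result should be 21 NUL bytes rather than an empty string. A minimal sketch to check this, using only `strlen`, `bin2hex`, and `str_repeat` (the variable names are illustrative):

```php
<?php
// XOR-ing a string with itself: every byte becomes chr(0), the length is kept.
$a = 'two identical strings';
$xor = $a ^ $a;

echo strlen($xor), "\n";   // 21
echo bin2hex($xor), "\n";  // 42 hex zeros, i.e. 21 NUL bytes

// The result is not the empty string; it equals 21 repeated NUL bytes.
var_dump($xor === '');                             // bool(false)
var_dump($xor === str_repeat("\0", strlen($a)));   // bool(true)
```

NUL bytes have no glyph, which is why the browser and a plain copy-paste show nothing even though `var_dump` reports 21 bytes.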
记忆之渊 2024-11-23 03:26:51

It's the bitwise OR operator.

If both the left-hand and right-hand parameters are strings, the bitwise operator will operate on the characters' ASCII values.

Source: http://php.net/manual/en/language.operators.bitwise.php
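
To make the quoted rule concrete for the `'one' | 'two'` example in the question, here is a rough sketch that ORs two strings byte by byte and compares the result with the built-in operator. `manualOr` is a hypothetical helper written for illustration, not part of PHP; it assumes the behavior that `|` keeps the length of the longer operand, while `&` and `^` keep the length of the shorter one.

```php
<?php
// Hypothetical helper: OR two strings byte by byte, mimicking what
// PHP's | does when both operands are strings.
function manualOr(string $a, string $b): string
{
    // Make $a the longer operand; its tail is copied unchanged.
    if (strlen($a) < strlen($b)) {
        [$a, $b] = [$b, $a];
    }
    $out = $a;
    for ($i = 0, $n = strlen($b); $i < $n; $i++) {
        $out[$i] = chr(ord($a[$i]) | ord($b[$i]));
    }
    return $out;
}

// 'o'|'t' = 0x6F|0x74 = 0x7F, 'n'|'w' = 0x6E|0x77 = 0x7F, 'e'|'o' = 0x65|0x6F = 0x6F
echo bin2hex('one' | 'two'), "\n";    // 7f7f6f
echo bin2hex('one' | 'twoe'), "\n";   // 7f7f6f65
var_dump(manualOr('one', 'twoe') === ('one' | 'twoe')); // bool(true)

// Result lengths for operands of different length.
echo strlen('abc' | 'abcdef'), "\n";  // 6
echo strlen('abc' ^ 'abcdef'), "\n";  // 3
```

Byte 0x7F is the DEL control character, which has no visible glyph. That is why only the trailing 'o' (or 'oe') seems to be printed, and it is not a comparison of common letters.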

誰ツ都不明白 2024-11-23 03:26:51

The pipe character | is the bitwise inclusive OR operator:

http://www.php.net/manual/en/language.operators.bitwise.php

Here is a previous SO thread on how it handles strings:

How to Bitwise compare a String
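
As for the 'boxy' character from the question: it is most likely a control byte produced by the operation, which editors such as Notepad++ draw as a box and browsers do not draw at all. One way to inspect such output is to escape every byte outside printable ASCII. The sketch below assumes a hypothetical helper named `visible`; it is an illustration, not an existing PHP function.

```php
<?php
// Hypothetical helper: replace every byte outside printable ASCII
// (0x20..0x7E) with a \xNN escape so control characters become visible.
function visible(string $s): string
{
    return preg_replace_callback(
        '/[^\x20-\x7E]/',
        function (array $m): string {
            return sprintf('\\x%02X', ord($m[0]));
        },
        $s
    );
}

echo visible('one' | 'two'), "\n";      // \x7F\x7Fo
echo visible('hallo' ^ 'hello'), "\n";  // \x00\x04\x00\x00\x00
```

Passing the longer test strings from the question through the same helper would show exactly which bytes appear as boxes.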
