PHP preg_replace not working / weird string type / character count not adding up
I want to remove the following specified characters from a string:
<
&
"
#
%
So something like:
Test%#"&<value
should become:
Testvalue
The order of characters is immaterial.
There's something weird about the string type.
A test string looks like this:
var_dump($test): string(25) "A BCDEFG/<&"#%/HI"
The number of characters is NOT adding up to 25 and I'm not sure why.
If I do:
$displayName = strtr($displayName, array('<' => '', '&' => '', '"' => '', '#' => '', '%' => ''));
I get:
string(20) "A BCDEFG/lt;quot;/HI"
Comments (3)
Escaping the < will work:

However, the surrounding / characters serve as delimiters for the preg_* family. Therefore, you must do the following to remove these, too:

(Here I use | as the delimiter instead, since this character is not part of the expression itself.)

EDIT

If you're just interested in replacing the characters <, &, ", #, and %, this is probably preferable:

EDIT 2

A great deal of confusion later, it seems that the $displayName string did not contain the literal characters shown in the dump (A BCDEFG/<&"#%/HI) but HTML entities such as &lt; and &quot;, which the browser rendered as < and ". That is why strtr() left "lt;" and "quot;" behind. In that case you could replace the HTML entities directly (untested):
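The snippets for the two edits are not shown here. As a minimal sketch of the non-regex route they describe, str_replace() accepts an array of search strings, so the entity forms can be listed ahead of the raw characters. The input string below is an assumption: it is one string consistent with both dumps in the question (25 bytes before, 20 bytes after strtr()), not a value confirmed by the asker.

```php
<?php
// Assumed input: consistent with string(25) before and string(20) after strtr().
$displayName = 'A BCDEFG/&lt;&&quot;#%/HI';

// Plain characters only: an array of needles, no regex needed.
$plain = str_replace(array('<', '&', '"', '#', '%'), '', 'Test%#"&<value');

// Entity forms first, then any raw characters that are left over.
// str_replace() applies the needles in array order, so "&lt;" is removed
// as a whole before the bare "&" needle can split it into "lt;".
$needles = array('&lt;', '&amp;', '&quot;', '<', '&', '"', '#', '%');
$clean = str_replace($needles, '', $displayName);

var_dump($plain); // string(9) "Testvalue"
var_dump($clean); // string(12) "A BCDEFG//HI"
```

The order of the needles matters here: putting '&' before '&lt;' would reproduce the "lt;quot;" result from the question.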
If you want to do it with regular expressions:
This outputs:
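The code and output for this answer are not included above. A minimal sketch of a regex-based approach, not necessarily the answerer's own, could combine a character class for the raw characters with an alternation for the entity forms. The input string and the choice of ~ as delimiter are assumptions made for illustration.

```php
<?php
// Assumed input, reconstructed from the byte counts given in the question.
$displayName = 'A BCDEFG/&lt;&&quot;#%/HI';

// "~" is the delimiter, so a literal "/" can appear in the pattern unescaped.
// The entity alternation comes first so "&lt;" is matched as a whole.
$pattern = '~&(?:lt|amp|quot);|[<&"#%]~';

$a = preg_replace($pattern, '', 'Test%#"&<value');
$b = preg_replace($pattern, '', $displayName);

// Variant that also strips the slashes around the special characters.
$c = preg_replace('~&(?:lt|amp|quot);|[/<&"#%]~', '', $displayName);

var_dump($a); // string(9) "Testvalue"
var_dump($b); // string(12) "A BCDEFG//HI"
var_dump($c); // string(10) "A BCDEFGHI"
```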
Regarding the "number of characters not adding up" part: it used to be true that one character was one byte, but you can't count on that any more. var_dump() reports the length of the string in bytes, not in characters, which you usually should not care about.
When working with strings in high-level languages, you should concentrate on the number of characters and forget how many bytes a string occupies. (There are exceptions, of course.)
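To make the byte-versus-character distinction concrete, here is a small sketch comparing strlen(), which counts bytes, with mb_strlen(), which counts characters and is only available when the mbstring extension is loaded. The entity-laden string is an assumed reconstruction of the asker's input; it shows that in this particular case the extra bytes come from HTML entities rather than from multibyte characters.

```php
<?php
// Assumed reconstruction of the asker's string: entities, not multibyte text.
$displayName = 'A BCDEFG/&lt;&&quot;#%/HI';
$bytes = strlen($displayName);

// "\u{00E9}" (PHP 7+) is the letter e with acute accent, two bytes in UTF-8.
$eAcute = "\u{00E9}";
$eBytes = strlen($eAcute);

var_dump($bytes);  // int(25), the count var_dump() reported in the question
var_dump($eBytes); // int(2)

// mb_strlen() needs the mbstring extension, so guard the call.
if (function_exists('mb_strlen')) {
    var_dump(mb_strlen($eAcute, 'UTF-8')); // int(1)
}
```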