PHP preg_replace not working / weird string type / character count not adding up

Posted 2024-09-30 13:46:49, 600 characters, 3 views, 0 comments

I want to remove the following specified characters from a string:

<
&
"
#
%

So something like:

Test%#"&<value

should become:

Testvalue

The order of characters is immaterial.


There's something weird about the string type.

A test string looks like this:

var_dump($test): string(25) "A BCDEFG/<&"#%/HI" 

The number of characters is NOT adding up to 25 and I'm not sure why.


If I do:

$displayName = strtr($displayName, array('<' => '', '&' => '', '"' => '', '#' => '', '%' => ''));

I get:

 string(20) "A BCDEFG/lt;quot;/HI"
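
One plausible explanation for the mismatch, consistent with the reported lengths of 25 and 20, is that the string holds HTML entities that a browser renders as single characters. The sketch below assumes the raw value is `A BCDEFG/&lt;&&quot;#%/HI`; that value is reconstructed from the byte counts and is not shown verbatim in the question.

```php
<?php
// Assumed raw value: '<' and '"' are stored as the entities &lt; and &quot;.
$test = 'A BCDEFG/&lt;&&quot;#%/HI';

// strlen() counts the bytes of the raw string, entities included.
$rawLength = strlen($test);

// What a browser would display once the entities are decoded.
$rendered = html_entity_decode($test, ENT_QUOTES, 'UTF-8');

// Removing the literal characters strips the '&' but leaves "lt;" and "quot;" behind.
$stripped = strtr($test, array('<' => '', '&' => '', '"' => '', '#' => '', '%' => ''));

echo $rawLength, "\n";                         // 25
echo $rendered, "\n";                          // A BCDEFG/<&"#%/HI
echo strlen($stripped), ' ', $stripped, "\n";  // 20 A BCDEFG/lt;quot;/HI
```

Under that assumption, viewing the page source instead of the rendered page (or wrapping the value in htmlspecialchars() before dumping it) would show the entities directly.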

Comments (3)

如梦亦如幻 2024-10-07 13:46:49

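Escaping the < will work, although this pattern matches only the exact sequence <&"#% rather than each character individually: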

$displayName = preg_replace('/\<&"#%/', '', $displayName);

However, the surrounding / characters serve as delimiters for the preg_* family, so they are not part of the pattern itself. To match the literal slashes as well, you must use a different delimiter:

$displayName = preg_replace('|/\<&"#%/|', '', $displayName);

(Here I use | as delimiter instead since this character is not part of the expression itself.)


EDIT
If you're just interested in replacing the characters <, &, ", #, and %, this is probably preferable:

$displayName = str_replace(array('<', '&', '"', '#', '%'), '', $displayName);

EDIT 2
A great deal of confusion later, it seems that the $displayName string did actually contain A BCDEFG/&lt;&&quot;#%/HI. In that case you could replace the HTML entities directly (untested):

    $displayName = str_replace(array('&lt;', '&quot;', '&', '#', '%'), '', $displayName);
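
An alternative sketch, assuming the value is entity-encoded as described in EDIT 2: decode the entities first with html_entity_decode() and then strip the literal characters, so the search list does not have to mirror every entity spelling. The input value and the helper name removeSpecialChars are illustrative assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper: decode HTML entities, then drop the unwanted characters.
function removeSpecialChars(string $value): string
{
    // Turn &lt;, &quot;, &amp; and similar entities back into plain characters.
    $decoded = html_entity_decode($value, ENT_QUOTES, 'UTF-8');

    // Remove every occurrence of the five characters.
    return str_replace(array('<', '&', '"', '#', '%'), '', $decoded);
}

$displayName = removeSpecialChars('A BCDEFG/&lt;&&quot;#%/HI');
echo $displayName, "\n"; // A BCDEFG//HI
```

This only decodes one level, so a double-encoded value such as `&amp;lt;` would still leave `lt;` behind.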

羁绊已千年 2024-10-07 13:46:49

If you want to do it with regular expressions:

<?php
$displayName = 'A BCDEFG/<&"#%/HI';
$displayName = preg_replace('/[\<&"#%]/', '', $displayName);
echo($displayName);
?>

This outputs:

A BCDEFG//HI

御弟哥哥 2024-10-07 13:46:49

Regarding the "number of characters not adding up" part: It used to be true that one char was one byte, but you can't count on that any more. My guess is that var_dump() shows how many bytes the string is internally, which you really should not care about.

When working with strings in high-level languages you really should concentrate on number of characters and forget how many bytes a string is. (there are exceptions, of course ;)
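
A minimal sketch of the byte-versus-character distinction, assuming UTF-8 strings and PHP 7 or later (for the `\u{...}` escape). The sample word is an arbitrary illustration; where the mbstring extension is available, mb_strlen() is the more common way to count characters.

```php
<?php
// "caf\u{00E9}" is "café"; in UTF-8 the é occupies two bytes.
$word = "caf\u{00E9}";

// strlen() reports bytes, which is also the length var_dump() prints.
$byteCount = strlen($word);

// With the /u modifier each match of "." is one UTF-8 code point,
// so the number of matches is the character count.
$charCount = preg_match_all('/./su', $word);

echo $byteCount, "\n"; // 5
echo $charCount, "\n"; // 4
```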
