I have a string with "\u00a0" and need to replace it with "", but str_replace fails

Posted 2024-08-28 04:01:01

I need to clean strings that are copied and pasted from various Microsoft Office applications (Excel, Access, and Word), each with its own encoding.

I'm using json_encode for debugging so that I can see every encoded character.

I'm able to clean everything I found so far (\r \n) with str_replace, but with \u00a0 I have no luck.

$string = '[email protected]\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;[email protected]'; //this is the output from json_encode

$clean = str_replace("\u00a0", "",$string);

returns:

[email protected]\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;[email protected]

That is exactly the same; it completely ignores \u00a0.

Is there a way around this? Also, I feel like I'm reinventing the wheel: is there a function or class that strips EVERY possible character of EVERY possible encoding?

____EDIT____

After the first two replies I need to clarify that my example DOES work, because it's the output from json_encode, not the actual string!
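
The edit above is the key point: json_encode displays U+00A0 as the six ASCII characters \u00a0, while the string being cleaned holds the raw bytes. A minimal sketch of that mismatch, assuming the pasted text is UTF-8 (where the character is the byte pair 0xC2 0xA0); the address is a made-up placeholder, not the one from the question:

```php
<?php
// Assumption: the input is UTF-8, so U+00A0 is stored as the bytes C2 A0.
// "user@example.com" is a placeholder used only for illustration.
$actual = "user@example.com\xC2\xA0 \xC2\xA0;";

// json_encode escapes non-ASCII characters by default, so the debug output
// contains a literal backslash followed by "u00a0".
$debug = json_encode($actual);

// Searching the real string for the six-character escape text finds nothing.
$unchanged = str_replace('\u00a0', '', $actual);

// Searching for the real bytes removes the non-breaking spaces.
$clean = str_replace("\xC2\xA0", '', $actual);
```

So the replacement has to target the bytes in the real string, not the escape sequence seen in the debug output.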

Comments (9)

七度光 2024-09-04 04:01:01

By combining ord() with substr() on my string containing \u00a0, I found the following curse to work:

$text = str_replace( chr( 194 ) . chr( 160 ), ' ', $text );
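
The values 194 and 160 are 0xC2 and 0xA0, the UTF-8 encoding of U+00A0. One way to find such bytes is to list them, which is roughly what combining substr() and ord() by hand amounts to. A small sketch; `dump_bytes` is a hypothetical helper name, not a built-in:

```php
<?php
// Hypothetical helper: return every byte of a string as a decimal value.
function dump_bytes(string $s): array
{
    $bytes = [];
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $bytes[] = ord($s[$i]);
    }
    return $bytes;
}

// "a", a UTF-8 non-breaking space, then "b".
$bytes = dump_bytes("a\xC2\xA0b"); // [97, 194, 160, 98]
```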
各自安好 2024-09-04 04:01:01

Try this:

$str = str_replace("\u{00a0}", ' ', $str);
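
Note that the "\u{...}" escape needs PHP 7.0 or later and is only interpreted inside double quotes. A short sketch showing that it yields the same two bytes as the other spellings used in this thread:

```php
<?php
// Three ways to write a UTF-8 non-breaking space in PHP 7.0+.
$viaCodepoint = "\u{00a0}";
$viaHex       = "\xC2\xA0";
$viaChr       = chr(194) . chr(160);

$allSame = ($viaCodepoint === $viaHex) && ($viaHex === $viaChr);

// In single quotes the escape is not interpreted and stays as typed.
$literal = '\u{00a0}'; // 8 characters: backslash, u, {, 0, 0, a, 0, }
```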
心房敞 2024-09-04 04:01:01

Works for me, when I copy/paste your code. Try replacing the double quotes in your str_replace() with single quotes, or escaping the backslash ("\\u00a0").

醉酒的小男人 2024-09-04 04:01:01

I just had the same problem. Apparently PHP's json_encode will return null for any string with a 'non-breaking space' in it.

The solution is to replace it with a regular space:

$str = str_replace(chr(160), ' ', $str);

I hope this helps somebody - it took me an hour to figure out.
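
The json_encode failure described here most likely comes from the input not being valid UTF-8 (a lone 0xA0 byte, as in ISO-8859-1 or Windows-1252) rather than from the non-breaking space as such. A sketch of both cases, assuming PHP 7 or later, where json_encode returns false on invalid UTF-8:

```php
<?php
// Assumption: PHP 7+, where json_encode() returns false for invalid UTF-8.
$latin1 = "a\xA0b";     // NBSP as the single byte 0xA0 (single-byte encoding)
$utf8   = "a\xC2\xA0b"; // NBSP encoded as UTF-8

$latin1Fails = json_encode($latin1) === false
    && json_last_error() === JSON_ERROR_UTF8;

// chr(160) repairs the single-byte case ...
$latin1Fixed = str_replace(chr(160), ' ', $latin1);

// ... but on UTF-8 input it leaves a stray 0xC2 byte behind,
// which is itself invalid UTF-8.
$utf8Broken = str_replace(chr(160), ' ', $utf8);

// For UTF-8 input, replace the full two-byte sequence instead.
$utf8Fixed = str_replace("\xC2\xA0", ' ', $utf8);
```

Which replacement is right therefore depends on the encoding of the incoming string.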

节枝 2024-09-04 04:01:01

This one also works; I found it somewhere:

$str = trim($str, chr(0xC2).chr(0xA0));
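
Two caveats worth knowing: trim() only touches the ends of the string, and it treats its second argument as a set of single bytes rather than as one sequence. A sketch of both effects, with a character-aware alternative based on the /u modifier (valid UTF-8 input is assumed):

```php
<?php
// Only leading and trailing bytes are removed; the inner NBSP stays.
$padded = "\xC2\xA0a\xC2\xA0b\xC2\xA0";
$ends   = trim($padded, "\xC2\xA0"); // "a\xC2\xA0b"

// "à" is C3 A0 in UTF-8, so its final byte 0xA0 is stripped too,
// leaving a dangling 0xC3 (invalid UTF-8).
$damaged = trim("voil\xC3\xA0", "\xC2\xA0");

// Alternative: match whole U+00A0 characters at either end.
$safe = preg_replace('/^\x{00A0}+|\x{00A0}+$/u', '', "\xC2\xA0voil\xC3\xA0\xC2\xA0");
```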
携君以终年 2024-09-04 04:01:01

A minor point: \u00a0 is actually a non-breaking space character; see http://www.fileformat.info/info/unicode/char/a0/index.htm

So it might be more correct to replace it with " "

眼泪淡了忧伤 2024-09-04 04:01:01

This did the trick for me:

$str = preg_replace( "~\x{00a0}~siu", " ", $str );
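
The same regex approach can be widened toward the "strip everything" part of the question. A rough sketch, assuming valid UTF-8 input; `normalize_spaces` is a hypothetical helper, and it does not cover every invisible character (the zero-width space U+200B, for instance, is in a different Unicode category):

```php
<?php
// Hypothetical helper: collapse runs of Unicode separators (\p{Z}) and
// ASCII whitespace (\s) into one regular space, then trim the ends.
function normalize_spaces(string $s): string
{
    $out = preg_replace('/[\p{Z}\s]+/u', ' ', $s);
    // preg_replace returns null on error (e.g. invalid UTF-8); keep the input then.
    return $out === null ? $s : trim($out);
}

$result = normalize_spaces("a\u{00a0} \u{00a0}\r\nb ");
```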
最冷一天 2024-09-04 04:01:01

You can use json_encode($string, JSON_UNESCAPED_UNICODE |JSON_PRETTY_PRINT);

吃不饱 2024-09-04 04:01:01

You have to do this with single quotes like this:

str_replace('\u00a0', "",$string);

Or, if you prefer double quotes, you have to escape the backslash, like this:

str_replace("\\u00a0", "",$string);
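
For what it's worth, PHP only interprets the braced form "\u{...}", so all three spellings below produce the same six-character string. A sketch showing that they match escaped text (such as json_encode output) but not a real non-breaking space:

```php
<?php
// PHP leaves "\u00a0" (no braces) exactly as typed, even in double quotes.
$single  = '\u00a0';
$double  = "\u00a0";
$escaped = "\\u00a0";
$identical = ($single === $double) && ($double === $escaped);

// Matches the escape text found in JSON-encoded output ...
$fromJson = str_replace($single, '', 'a\u00a0b');

// ... but leaves a real UTF-8 non-breaking space untouched.
$fromRaw = str_replace($single, '', "a\xC2\xA0b");
```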