preg_replace, character escapes & accented characters: /u works on one server but not on another
I have the following code:
preg_replace('/[^\w-]/u','.','Bréánná MÓÚLÍN');
Which on server A (PHP 5.3.5) returns:
"Bréánná.Móúlín" (as it should)
However, on server B (PHP 5.2.11) it returns:
"Br..n..M..l.n" (not what I want at all)
Am I right in thinking that this is down to whether or not PCRE_UCP was set when the whole thing was compiled?
Is there any way of overriding this if this is the case?
Failing that, is there any way of easily replacing such characters with a 'standard' equivalent? (Like utf8_decode but more expansive)
Comments (1)
I am not sure whether PCRE_UCP being defined at compile time affects preg_replace(), but a workaround for your problem is to use the multibyte string function mb_ereg_replace():
PHP 5.2 results: http://codepad.viper-7.com/UnZeyf
EDIT: I originally thought that the multibyte ereg functions supported Unicode character type escapes, but this turns out not to be true. Instead, you need to determine the ranges of characters that you consider "letters". I used the character ranges from the XML Standard's definition of NameChar with the following Java program to generate the RegExp string (as apparently the multibyte ereg functions do not support Unicode character escape sequences, either):
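The answerer's program did not survive on this page. What follows is only a minimal sketch of what such a generator might look like, not the original code. It assumes the ranges of the NameStartChar and NameChar productions from XML 1.0 (Fifth Edition), and the class name NameCharRegex and its methods are hypothetical. It emits the characters themselves instead of \uXXXX escapes, since the answer notes that the multibyte ereg functions do not understand those.

```java
import java.io.PrintStream;

public class NameCharRegex {
    // Inclusive code point ranges taken from the NameStartChar and NameChar
    // productions of XML 1.0 (Fifth Edition). Assumption: these are the
    // ranges the answer refers to.
    static final int[][] RANGES = {
        {':', ':'}, {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF}, {0x370, 0x37D},
        {0x37F, 0x1FFF}, {0x200C, 0x200D}, {0x2070, 0x218F},
        {0x2C00, 0x2FEF}, {0x3001, 0xD7FF}, {0xF900, 0xFDCF},
        {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF},
        {'-', '-'}, {'.', '.'}, {'0', '9'}, {0xB7, 0xB7},
        {0x300, 0x36F}, {0x203F, 0x2040}
    };

    // Appends one code point, backslash-escaping the characters that are
    // special inside a regex character class.
    static void appendLiteral(StringBuilder sb, int cp) {
        if (cp == '-' || cp == ']' || cp == '\\' || cp == '^') {
            sb.append('\\');
        }
        sb.appendCodePoint(cp);
    }

    // Builds a negated character class of the form [^:A-Z_a-z...] that
    // contains literal characters rather than escape sequences.
    static String build() {
        StringBuilder sb = new StringBuilder("[^");
        for (int[] range : RANGES) {
            appendLiteral(sb, range[0]);
            if (range[1] != range[0]) {
                sb.append('-');
                appendLiteral(sb, range[1]);
            }
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) throws Exception {
        // Write UTF-8 explicitly so the output does not depend on the
        // platform default encoding.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(build());
    }
}
```

The printed character class could then be pasted into the pattern passed to mb_ereg_replace(), presumably after calling mb_regex_encoding('UTF-8'). Note that NameChar also contains ':' and '.', so those two entries would need to be dropped from RANGES if they should be replaced as well. The sketch also assumes the input uses precomposed (NFC) characters.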