Using Unicode in a PHP regular expression
I'm trying to add the character Ö (U+00D6) to my regular expression below. Apparently something is going wrong, because it's not working with my preg_match call.
The regular expression that works fine:
/^([A-Z]{1})[a-z]{1,31}$/
The one that should work but does not:
/^([A-Z\x{00D6}]{1})[a-z]{1,31}$/
I'm trying to create a regular expression that matches a string starting with an uppercase letter (A-Z, extended with Ö) followed by lowercase letters. In total, the length of the string must be between 2 and 32 characters. What is wrong with the regular expression that contains the Unicode escape for Ö?
\x{00D6} will only match the single byte \xD6. When you pass a string to preg_match, however, it is most likely encoded in UTF-8, where Ö is the two-byte sequence \xC3 \x96. You need to add the /u modifier to your regex to support that. Also, the {1} is decorative but redundant.
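As a rough illustration of the fix described above, the sketch below applies the /u modifier and drops the redundant {1}. It assumes PHP 7 or later (for the "\u{...}" string escape) and UTF-8 encoded input; the helper name isValidName is hypothetical and not part of the original question.

```php
<?php
// Hypothetical helper, introduced only for this illustration.
function isValidName(string $name): bool
{
    // With /u, PCRE treats both pattern and subject as UTF-8, so
    // \x{00D6} matches the code point U+00D6 (Ö) instead of the byte 0xD6.
    // The {1} quantifier from the original pattern is omitted as redundant.
    return preg_match('/^[A-Z\x{00D6}][a-z]{1,31}$/u', $name) === 1;
}

// Same pattern without /u, kept for comparison with the original attempt.
function isValidNameByteMode(string $name): bool
{
    return preg_match('/^[A-Z\x{00D6}][a-z]{1,31}$/', $name) === 1;
}

$sample = "\u{00D6}sterreich"; // "Österreich", UTF-8 encoded

var_dump(isValidName($sample));         // matches with /u
var_dump(isValidNameByteMode($sample)); // no match: first byte is 0xC3, not 0xD6
var_dump(isValidName("Austria"));       // plain ASCII still matches
```

Note that with /u, preg_match returns false (rather than 0) when the subject is not valid UTF-8, so comparing the result strictly against 1 treats malformed input as a non-match.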