Using Unicode in PHP regular expressions

Posted on 2024-12-23 06:45:06

I'm trying to add the character Ö (U+00D6) to the regular expression below, but the extended pattern does not match when I pass it to preg_match.

The regular expression that works fine:

/^([A-Z]{1})[a-z]{1,31}$/

The one that should work but does not:

/^([A-Z\x{00D6}]{1})[a-z]{1,31}$/

I want a pattern that matches a string starting with one uppercase letter (A-Z or Ö) followed by lowercase letters, with a total length of 2 to 32 characters. What is wrong with the version that uses the Unicode escape for Ö?


恰似旧人归 2024-12-30 06:45:06

Without the /u modifier, \x{00D6} matches only the single byte \xD6. The string you pass to preg_match is most likely encoded in UTF-8, where Ö is the two-byte sequence \xC3 \x96, so the character class never sees a byte it can match.

You need to add the /u modifier so that both the pattern and the subject are treated as UTF-8:

/^([A-Z\x{00D6}]{1})[a-z]{1,31}$/u

Also, the {1} is purely decorative and redundant; [A-Z\x{00D6}] on its own already matches exactly one character.
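The difference can be checked with a short script. This is only a minimal sketch: the sample strings and variable names are made up for illustration, it assumes PHP 7.0 or later (for the "\u{...}" string escape), and it assumes the subject string is valid UTF-8.

```php
<?php
// Hypothetical sample input: "\u{00D6}" produces the UTF-8 bytes C3 96 for Ö.
$name = "\u{00D6}sterreich";

// Same pattern with and without the /u modifier ({1} dropped as redundant).
// Single quotes keep \x{00D6} intact so that PCRE, not PHP, interprets it.
$withoutU = '/^([A-Z\x{00D6}])[a-z]{1,31}$/';
$withU    = '/^([A-Z\x{00D6}])[a-z]{1,31}$/u';

// Byte mode: the class can only match the byte D6, but the subject starts with C3.
$r1 = preg_match($withoutU, $name);

// UTF-8 mode: \x{00D6} is the code point U+00D6 and the subject is decoded as UTF-8.
$r2 = preg_match($withU, $name);

// Plain ASCII input still matches with /u.
$r3 = preg_match($withU, "Oslo");

// A lone Ö is too short: at least one lowercase letter must follow.
$r4 = preg_match($withU, "\u{00D6}");

var_dump($r1, $r2, $r3, $r4); // int(0), int(1), int(1), int(0)
```

Note that [a-z] still covers only ASCII lowercase letters, so a name containing ö, ä, or ü after the first character would need those added to the second class as well.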