Regular expressions and the Russian language

Posted on 2024-08-07 15:37:05

I can't solve my problem with a regexp.

When I use:

$string = preg_replace("#\[name=([a-zA-Z0-9 .-]+)*]#","$name_start $1 $name_end",$string);

everything works, except for text in Russian.

So I tried rewriting the regexp as:

$string = preg_replace("#\[name=([a-zA-Z0-9а-яА-Я .-]+)*]#","$name_start $1 $name_end",$string);

but this does not work.

I know one workaround, which is to list every letter:

$string = preg_replace("#\[name=([a-zA-Z0-9йцукенгшщзхъфывапролджэячсмитьбю .-]+)*]#","$name_start $1 $name_end",$string);

but this is crazy :D

Please suggest a simpler variant.

请你别敷衍 2024-08-14 15:37:05

Try a Unicode range:

'/[\x{0410}-\x{042F}]/u'  // matches one capital Cyrillic letter in the range А (U+0410) to Я (U+042F)

Don't forget the u modifier, which makes PCRE treat the pattern and subject as UTF-8.

In your case:

"#\[name=([a-zA-Z0-9\x{0430}-\x{044F}\x{0410}-\x{042F} .-]+)*]#u"

Note that the star (*) in your regex is redundant: the plus (+) already "eats" everything. Apart from no longer matching an empty [name=], this does the same:

"#\[name=([a-zA-Z0-9\x{0430}-\x{044F}\x{0410}-\x{042F} .-]+)]#u"
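A minimal sketch of the pattern above dropped into the question's preg_replace call. The values of $name_start and $name_end and the sample input are assumptions for illustration, since the question does not show them. The two ranges are merged into \x{0410}-\x{044F} because they are contiguous, and ё/Ё (U+0451, U+0401) are added separately because they lie outside that range; that addition is not part of the answer above.

```php
<?php
// Hypothetical wrapper markup; the question does not show the real values.
$name_start = '<b>';
$name_end   = '</b>';

// Sample input, invented for this sketch.
$string = 'Hello [name=Иван Петров] and [name=John Smith-Jr.]';

// The u modifier makes PCRE read pattern and subject as UTF-8, so
// \x{0410}-\x{044F} covers А-Я and а-я as whole code points.
// \x{0401} and \x{0451} add Ё and ё, which sit outside that range.
$pattern = '#\[name=([a-zA-Z0-9\x{0410}-\x{044F}\x{0401}\x{0451} .-]+)\]#u';

// "$1" is not interpolated by PHP (variable names cannot start with a digit),
// so it reaches preg_replace as a backreference to the captured name.
$result = preg_replace($pattern, "$name_start $1 $name_end", $string);

echo $result, PHP_EOL;
// Hello <b> Иван Петров </b> and <b> John Smith-Jr. </b>
```

Without the u modifier the same character class is read byte by byte, which is why the а-яА-Я attempt in the question fails on UTF-8 input.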

看海 2024-08-14 15:37:05

The Unicode script properties (supported since PCRE 3.3) provide a test for the Cyrillic script.

For example, to replace all characters that are neither Cyrillic nor (Latin) digits:

$string = '1a2b3cйdцeуfкбxюy';
echo preg_replace('/[^0-9\p{Cyrillic}]/u', '*', $string);

You can find the documentation for that feature at http://www.pcre.org/pcre.txt under "Unicode character properties".
You also have to specify the u modifier (PCRE_UTF8), as described at http://docs.php.net/reference.pcre.pattern.modifiers
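As a rough sketch, the same property can be used inside the character class from the question. The wrapper values and the sample input are assumptions for illustration, and the example assumes PHP's PCRE was built with Unicode property support, which is the usual case for the bundled library.

```php
<?php
// Hypothetical wrapper markup; the question does not show the real values.
$name_start = '<b>';
$name_end   = '</b>';

// \p{Cyrillic} matches any character of the Cyrillic script, including
// ё/Ё and non-Russian letters such as ү or ң. It requires the u modifier.
$pattern = '#\[name=([a-zA-Z0-9\p{Cyrillic} .-]+)\]#u';

// Sample input, invented for this sketch.
$string = '[name=Ёлка] [name=Күңел] [name=Bob]';

$result = preg_replace($pattern, "$name_start $1 $name_end", $string);

echo $result, PHP_EOL;
// <b> Ёлка </b> <b> Күңел </b> <b> Bob </b>
```

Compared with an explicit code point range, the script property is broader: it accepts every Cyrillic letter, not only the 33 letters of the Russian alphabet, which may or may not be what the application wants.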

波浪屿的海角声 2024-08-14 15:37:05

This one worked for me:

/^[а-яА-Я\p{Cyrillic}0-9\s\-]+$/

I have tested it in all browsers, including Safari.
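The pattern above was tested in browsers. Assuming the same whole-string check is also wanted on the PHP side, it might be sketched as follows; isCyrillicText is a hypothetical helper name, and the scalar type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helper: returns true when $input consists only of
// Cyrillic letters, the digits 0-9, whitespace and hyphens.
function isCyrillicText(string $input): bool
{
    // \A and \z anchor the match to the very start and end of the subject.
    // The u modifier is required for \p{Cyrillic} to work on UTF-8 input.
    return preg_match('/\A[\p{Cyrillic}0-9\s-]+\z/u', $input) === 1;
}

var_dump(isCyrillicText('Привет мир 2024')); // bool(true)
var_dump(isCyrillicText('Hello world'));     // bool(false)
```

With \p{Cyrillic} in the class, the explicit а-яА-Я range becomes unnecessary, so it is left out here.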

愿得七秒忆 2024-08-14 15:37:05

Below are checks for some of the most widely used scripts on the internet.

This has worked for a good while now, I believe since PHP 5.6.

// Filter Chinese and Japanese HAN
if (preg_match("/\p{Han}+/u", " 余TEST杭丽人广播", $match)){echo "CHINESE, JAPANESE ";}
// Filter Cyrillic
if (preg_match("/\p{Cyrillic}/u", "Күңел радиосы ", $match)){echo "RUSSIAN ";}
// Filter Greek
if (preg_match("/\p{Greek}/u", "Πρακτορείο ", $match)){echo "GREEK ";}
// Filter Arabic
if (preg_match("/\p{Arabic}/u", "مشال راډیو", $match)){echo "ARABIC ";}
// Filter Armenian
if (preg_match("/\p{Armenian}/u", "Ազատություն ", $match)){echo "ARMENIAN ";}
// Filter Thai
if (preg_match("/\p{Thai}/u", "สวท.พะเยา", $match)){echo "THAI ";}
// Filter Georgian
if (preg_match("/\p{Georgian}/u", "რადიო თავისუფალი", $match)){echo "GEORGIAN";}

/* Output: */
/* CHINESE, JAPANESE RUSSIAN GREEK ARABIC ARMENIAN THAI GEORGIAN */
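The repeated checks above could be folded into a loop. This is only a sketch: detectScripts is a hypothetical helper name, the script list simply mirrors the one in this answer, and the type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helper: returns the names of the listed Unicode scripts
// that occur at least once in $text.
function detectScripts(string $text): array
{
    $scripts = ['Han', 'Cyrillic', 'Greek', 'Arabic', 'Armenian', 'Thai', 'Georgian'];
    $found = [];
    foreach ($scripts as $script) {
        // Builds e.g. /\p{Cyrillic}/u; the u modifier is required.
        if (preg_match('/\p{' . $script . '}/u', $text) === 1) {
            $found[] = $script;
        }
    }
    return $found;
}

print_r(detectScripts('Күңел радиосы / Πρακτορείο'));
// Array ( [0] => Cyrillic [1] => Greek )
```

Note that a script match identifies the writing system, not the language: the Cyrillic sample here is not Russian, yet it matches \p{Cyrillic} all the same.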
