\w doesn't match enough; what should I use instead?

Posted 2024-11-09 17:08:49

(In PHP) I have the following string:

$string = '<!--:fr--><p>Mamá lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc ut est et tortor sagittis auctor id ut urna. Etiam quañ justo, pharetra sed bibendum at, vulputate et augue.</p> <p>Curabitur cursus mi vel quam placerat malesuada. Fusce euismod mollis tincidunt. Sed cursus, sem et porta dictum, elit purus facilisis massa, eget consectetur nisi libero eget leo. Vivamus vitae mattis nulla. varius fermentum.</p><!--:-->';

I want to remove <!--:fr--> and <!--:--> using

preg_replace('/<!--:[a-z]{2}-->(\w+)<!--:-->/', '${1}', $string)

but it returns the same $string. What is the problem?


谎言 2024-11-16 17:08:49

You have characters that fall outside of [a-zA-Z0-9_] (which is what \w matches), such as the spaces, punctuation and tag brackets in your string. You can match with [\s\S] instead, which means any whitespace or non-whitespace character (i.e. everything).

You could also use . with the s flag.

Try this:

preg_replace('/<!--:[a-z]{2}-->([\s\S]+?)<!--:-->/', '${1}', $string);
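As a rough illustration of the two points above, the sketch below runs the original \w+ pattern and the . with s flag variant side by side. The input is a shortened stand-in for the question's string, not the exact text, and the variable names are only for illustration.

```php
<?php
// Shortened sample input in the same shape as the question's string (an assumption).
$string = '<!--:fr--><p>Mamá lorem, quañ justo.</p> <p>Curabitur cursus.</p><!--:-->';

// Original attempt: \w+ cannot cross '<', '>', spaces or punctuation,
// so the pattern never matches and the input comes back unchanged.
$unchanged = preg_replace('/<!--:[a-z]{2}-->(\w+)<!--:-->/', '$1', $string);

// Alternative: with the s modifier, '.' also matches newlines;
// the lazy +? stops at the first closing marker.
$stripped = preg_replace('/<!--:[a-z]{2}-->(.+?)<!--:-->/s', '$1', $string);

var_dump($unchanged === $string);
echo $stripped, PHP_EOL;
```

The lazy quantifier matters if the string contains more than one marked block: a greedy .+ would run to the last closing marker instead of the nearest one.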


别想她 2024-11-16 17:08:49

Another possibility is to remove just the parts you don't want:

preg_replace('/<!--:(?:[a-z]{2})?-->/', '', $string);

This matches only the unwanted markers. (?:[a-z]{2})? is an optional pair of lowercase letters, so the pattern matches both the opening <!--:fr--> and the closing <!--:-->.
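A minimal sketch of this marker-only approach, using a hypothetical input that contains a second language block (the "en" block is an assumption, not part of the question):

```php
<?php
// Hypothetical input with two language blocks; the "en" block is made up for illustration.
$string = '<!--:fr--><p>Mamá lorem.</p><!--:--><!--:en--><p>Hello.</p><!--:-->';

// The optional group lets one pattern match both the opening and the closing marker.
$clean = preg_replace('/<!--:(?:[a-z]{2})?-->/', '', $string);

echo $clean, PHP_EOL;
```

Because the pattern never has to describe the text between the markers, it does not matter which characters that text contains.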


混吃等死 2024-11-16 17:08:49

To solve your problem, you only need a simple regex and PHP code like:

$string = preg_replace('/<!--:(fr)?-->/', '', $string);

To answer the question: \w is a very limited shortcut that I would not recommend. For example, it will not match the ñ in your input, nor will it match the comma. PHP has good support for Unicode. The shortcut \p{L} matches any letter from any language, and there are similar shortcuts for punctuation and so on. These can be combined in a character class. For example, to match at least one letter (including French and Spanish letters), dot or comma in any sequence, you can write:

[\p{L}.,]+

Here is some information on how this works:

http://www.regular-expressions.info/unicode.html
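To make the character-class idea concrete, here is a small sketch that matches runs of letters, dots and commas with \p{L}. It assumes the input is UTF-8 encoded and therefore adds the u modifier, which the answer does not mention; the sample text and variable names are illustrative only.

```php
<?php
// Sample text taken from words in the question's string; assumed to be UTF-8 encoded.
$text = 'Mamá lorem, quañ justo.';

// The u modifier makes PCRE treat pattern and subject as UTF-8,
// so \p{L} recognises 'á' and 'ñ' as single letters.
preg_match_all('/[\p{L}.,]+/u', $text, $matches);

print_r($matches[0]);
```

Each match is one run of letters, dots and commas; the spaces between words are not in the class, so they split the text into separate matches.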