PHP regex: notation for any character

Posted 2024-08-27 19:26:20 · 185 words · 4 views · 0 comments

In my regex, I want to say that any characters are allowed within the sample text, including a-z in upper and lower case, numbers, and special characters.

For example, my regular expression may be checking that a document is HTML. Therefore:

"/\n<html>[]+</html>\n/"

I have tried []+ but it does not seem to like this?

Comments (2)

狼性发作 2024-09-03 19:26:21

[XXX]+ matches any one of the characters listed between [ and ], one or more times.

Here, you didn't put any characters between [ and ], hence the problem.
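
A minimal sketch of what goes wrong, assuming a current PHP with its bundled PCRE library; the `#` delimiter, the sample string, and the variable names are choices made for this illustration, not part of the question. With nothing between the brackets, PCRE treats the `]` as the first literal member of the class and never finds a closing bracket, so the pattern fails to compile and `preg_match()` returns `false`:

```php
<?php
$document = "<html>Hello</html>";

// Empty brackets: the pattern does not compile, so preg_match() returns false.
// The @ only silences the compilation warning for this demo.
$broken = @preg_match('#<html>[]+</html>#', $document);

// With characters listed inside the brackets, the pattern compiles and matches.
$working = preg_match('#<html>[a-zA-Z]+</html>#', $document);

var_dump($broken);   // bool(false)
var_dump($working);  // int(1)
```

Separately, the pattern in the question uses / as its delimiter while also containing </html>; that inner slash would have to be escaped as \/, or a different delimiter chosen, as in this sketch.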

If you want to say "any possible character", you can use a `.`

Note: by default, it will not match newlines; you'll have to use the s modifier (see **Pattern Modifiers** in the manual) if you want it to.
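
As a rough illustration of that note (the sample text and variable names are made up for this sketch), the same pattern can be run with and without the s modifier:

```php
<?php
$text = "first line\nsecond line";

// Without the s modifier, . cannot cross the newline, so nothing matches.
$withoutS = preg_match('/first.+second/', $text);

// With the s modifier (PCRE_DOTALL), . also matches the newline.
$withS = preg_match('/first.+second/s', $text);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```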

If you want to say any letter, you can use:

  • for lower case: [a-z]
  • for upper case: [A-Z]
  • for both: [a-zA-Z]

And, for numbers:

  • [0-9]: any digit
  • [a-zA-Z0-9]: any lower-case or upper-case letter, and any digit.
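
The classes listed above could be tried out along these lines; this is only a sketch, and the `^`...`$` anchors and test strings are assumptions added so that the whole string has to consist of characters from the class:

```php
<?php
// Each class matches exactly one character from its set; + repeats it.
$lowerOnly = preg_match('/^[a-z]+$/', 'hello');          // only lower-case letters
$upperOnly = preg_match('/^[A-Z]+$/', 'Hello');          // fails: 'ello' is lower case
$letters   = preg_match('/^[a-zA-Z]+$/', 'Hello');       // letters of either case
$alnum     = preg_match('/^[a-zA-Z0-9]+$/', 'Hello42');  // letters and digits
$withSpace = preg_match('/^[a-zA-Z0-9]+$/', 'Hello 42'); // fails: space is not in the class

var_dump($lowerOnly, $upperOnly, $letters, $alnum, $withSpace);
// int(1) int(0) int(1) int(1) int(0)
```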

At that point, you will probably want to take a look at:

  • The Backslash section of the PCRE manual
  • And, especially, the \w meta-character, which means "any word character" (a letter, a digit, or an underscore)
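
A small sketch of \w in use, assuming the default (non-Unicode) mode where a word character is an ASCII letter, a digit, or an underscore; the sample strings are invented for the example:

```php
<?php
// \w accepts letters, digits and the underscore.
$identifier = preg_match('/^\w+$/', 'user_name42');

// A hyphen is not a word character, so the anchored pattern fails.
$hyphenated = preg_match('/^\w+$/', 'user-name');

// Collect every run of word characters in a sentence.
preg_match_all('/\w+/', 'any word, 42 times!', $matches);
$words = $matches[0];

var_dump($identifier); // int(1)
var_dump($hyphenated); // int(0)
print_r($words);       // any, word, 42, times
```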

After that, when you begin using a regex such as

/.+/s

which should match:

  • Any possible character
    • Including newlines
  • One or more times

you'll see that it doesn't "stop" when you expect it to. That's because matching is greedy by default: you'll have to put a ? after the + (giving .+?), or use the U modifier; see the Repetition section of the manual for more information.
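
To make the greedy behaviour concrete, here is a hedged sketch comparing the three variants on a made-up string; the `<b>` tags are only convenient markers for the example, not a recommendation to process HTML this way:

```php
<?php
$text = "<b>one</b> and <b>two</b>";

// Greedy: .+ runs to the LAST </b> it can reach.
preg_match('/<b>.+<\/b>/s', $text, $greedy);

// Lazy: .+? stops at the FIRST </b>.
preg_match('/<b>.+?<\/b>/s', $text, $lazy);

// U modifier: inverts greediness, so plain .+ behaves lazily here.
preg_match('/<b>.+<\/b>/sU', $text, $ungreedy);

echo $greedy[0], "\n";   // <b>one</b> and <b>two</b>
echo $lazy[0], "\n";     // <b>one</b>
echo $ungreedy[0], "\n"; // <b>one</b>
```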

Well, actually, the best thing to do would be to *invest* some time carefully reading everything in the **PCRE Patterns** section of the manual, if you want to start working with regexes ;-)

Oh, and, BTW: **using regex to *parse* HTML is a bad idea...**

It's generally much better to use a DOM parser.
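
The parsers the answer went on to list did not survive in this copy. As one possible illustration, and as an assumption rather than the answer's own recommendation, PHP's bundled `DOMDocument` class (from the dom extension) can load a document and expose its elements; the sample markup and variable names below are hypothetical:

```php
<?php
// Assumption: the dom extension is available (it is enabled by default in most builds).
$html = "<html><body><p>Hello</p></body></html>";

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadHTML($html);

// Name of the root element, if the parser produced one.
$root = $doc->documentElement;
$rootName = $root !== null ? $root->nodeName : null;

// Text of the first <p> element, if any.
$paragraphs = $doc->getElementsByTagName('p');
$firstText = $paragraphs->length > 0 ? $paragraphs->item(0)->textContent : null;

var_dump($loaded);   // bool(true)
var_dump($rootName); // string(4) "html"
var_dump($firstText); // string(5) "Hello"
```

Note that this parser is forgiving of malformed markup, so a successful load is not by itself proof that the input was well-formed HTML.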

世态炎凉 2024-09-03 19:26:21

The dot . is the metacharacter for "any character".
