PHP regular expression notation for any character
In my regex, I want to say that within the sample text, any characters are allowed, including a-z in upper and lower case, numbers, and special characters.
For example, my regular expression may be checking that a document is HTML. Therefore:
"/\n<html>[]+</html>\n/"
I have tried `[]+`, but it does not seem to like this.
Comments (2)
Using `[XXX]+` means any character that's between `[` and `]`, one or more times. Here, you didn't put any character between `[` and `]`, hence the problem.
If you want to say "any possible character", you can use a `.`
Note: by default, it will not match newlines; you'll have to play with [**Pattern Modifiers**][1] if you want it to.
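
As a minimal sketch of the newline behaviour described above: the sample text and variable names below are made up for illustration, and the `#` delimiter is my own choice so that the `/` in `</html>` needs no escaping. The `s` modifier (PCRE_DOTALL) is what makes `.` match newlines.

```php
<?php
// Hypothetical sample text, with newlines inside the <html> element.
$text = "<html>\nHello, World 123 !\n</html>";

// Without the s modifier, '.' does not match "\n", so nothing matches.
$withoutS = preg_match('#<html>.+</html>#', $text);

// With the s (PCRE_DOTALL) modifier, '.' matches newlines as well.
$withS = preg_match('#<html>.+</html>#s', $text);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```
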
If you want to say any letter, you can use:
- `[a-z]`: any lower-case letter
- `[A-Z]`: any upper-case letter
- `[a-zA-Z]`: any lower-case or upper-case letter
And, for numbers:
- `[0-9]`: any digit
- `[a-zA-Z0-9]`: any lower-case or upper-case letter, and any digit.
At that point, you will probably want to take a look at the `\w` meta-character, which means "any word character".
After that, when you begin using a regex with a `+` quantifier, you'll see that it doesn't "stop" when you expect it to. That's because matching is greedy by default: you'll have to use a `?` after the `+`, or use the `U` modifier; see the Repetition section for more information.
Well, actually, the best thing to do would be to *invest* some time carefully reading everything in the [**PCRE Patterns**][4] section of the manual, if you want to start working with regexes ;-)
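
To make the character classes and the greedy behaviour mentioned above a bit more concrete, here is a small sketch. The input strings and variable names are hypothetical, chosen only for illustration; the `#` delimiter is again just a convenience.

```php
<?php
// Character classes: \w also accepts the underscore, [a-zA-Z0-9] does not.
$onlyAlnum = preg_match('/^[a-zA-Z0-9]+$/', 'abc_123');
$wordChars = preg_match('/^\w+$/', 'abc_123');
var_dump($onlyAlnum); // int(0)
var_dump($wordChars); // int(1)

// Hypothetical input with two <b> elements.
$text = '<b>one</b> and <b>two</b>';

// Greedy (default): .+ takes as much as it can, up to the LAST </b>.
preg_match('#<b>.+</b>#', $text, $greedy);

// Lazy: a ? after the + stops at the FIRST possible </b>.
preg_match('#<b>.+?</b>#', $text, $lazy);

// The U modifier inverts greediness for the whole pattern.
preg_match('#<b>.+</b>#U', $text, $ungreedy);

echo $greedy[0], "\n";   // <b>one</b> and <b>two</b>
echo $lazy[0], "\n";     // <b>one</b>
echo $ungreedy[0], "\n"; // <b>one</b>
```
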
Oh, and, BTW: **using regex to *parse* HTML is a bad idea...**
It's generally much better to use a DOM parser, such as `DOMDocument::loadHTML`.
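
A rough sketch of the DOM approach, assuming PHP's DOM extension is available. The HTML snippet and variable names are invented for the example, and silencing libxml warnings is my own choice rather than something the answer prescribes.

```php
<?php
// Hypothetical HTML snippet to inspect.
$html = '<html><head><title>Demo</title></head><body><p>Hello</p></body></html>';

$doc = new DOMDocument();

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$loaded = $doc->loadHTML($html);
libxml_clear_errors();

// Read the text of the first <p> element through the DOM, no regex needed.
$paragraph = $doc->getElementsByTagName('p')->item(0);
echo $paragraph->textContent, "\n"; // Hello
```
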
The dot `.` is the meta-character for "any character".