How do I make this regex work?
I have a small problem. In
<tr><td>3</td><td>foo</td><td>2</td>
I want to find the foo. I use:
$<tr><td>\d</td><td>(.*)</td>$
to find the foo, but it doesn't work: the closing </td> in the pattern matches not the </td> right after foo but the </td> at the end of the string.
You have to make the .* lazy instead of greedy. Read more about lazy vs greedy here. Your end-of-string anchors ($) also don't make sense. Try:
(As seen on rubular.)
NOTE: I don't advocate using regex to parse HTML. But sometimes the task at hand is simple enough to be handled by a regex, and a full-blown XML parser would be overkill (for example: this question). Knowing how to pick the "right tool for the job" is an important skill in programming.
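To make the greedy vs. lazy difference concrete, here is a minimal sketch in Python's re module. This is an assumption on my part: the question doesn't say which regex engine is in use, and the variable names are just for illustration. It takes the pattern from the question, drops the $ anchors, and changes only the quantifier:

```python
import re

# Sample row taken from the question.
row = "<tr><td>3</td><td>foo</td><td>2</td>"

# Greedy: (.*) grabs as much as possible, so the pattern's closing </td>
# ends up matching the LAST </td> in the string.
greedy = re.search(r"<tr><td>\d</td><td>(.*)</td>", row)

# Lazy: (.*?) grabs as little as possible, so the pattern's closing </td>
# matches the FIRST </td> after the captured text.
lazy = re.search(r"<tr><td>\d</td><td>(.*?)</td>", row)

print(greedy.group(1))  # foo</td><td>2
print(lazy.group(1))    # foo
```

The only change between the two patterns is the ? after the *, which is how most Perl-style engines spell a lazy quantifier.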
Your leading $ should be a ^. If you don't want to match all the way to the end of the string, don't use a $ at the end. However, since * is greedy, it'll grab as much as it can. Some regex implementations have a non-greedy version which would work, but you probably just want to change (.*) to ([^<]*).
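A small sketch of the ^ anchor plus ([^<]*) suggestion, again assuming Python's re module (the question doesn't name the engine; the variable names are illustrative only):

```python
import re

# Sample row taken from the question.
row = "<tr><td>3</td><td>foo</td><td>2</td>"

# In Python's re, a leading $ only matches at the end of the string,
# so nothing can follow it and the original pattern finds no match.
original = re.search(r"$<tr><td>\d</td><td>(.*)</td>$", row)
print(original)  # None

# ^ anchors the match at the start of the string. [^<]* can never cross
# a '<', so the capture stops right before the first </td>.
fixed = re.search(r"^<tr><td>\d</td><td>([^<]*)</td>", row)
cell = fixed.group(1) if fixed else None
print(cell)  # foo
```

The negated character class works even in engines without lazy quantifiers, but it only fits when the cell content itself never contains a < character.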
Use:
(Insert obligatory comment about not using regex to parse XML.)