Why doesn't my C# regex match across lines?
I have the following Regex in C#:
Regex h1Separator = new Regex(@"<h1>(?'name'[\w\d\s]+?)(<br\s?/?>)?</h1>", RegexOptions.Singleline);
I'm trying to match a string that looks like this:
<h1>test content<br>
</h1>
Right now it only matches strings that look like the following:
<h1>test content<br></h1>
<h1>test content</h1>
What am I doing wrong? Should I be matching for a newline character? If so, what is it in C#? I can't find one.
Comments (5)
You don't check for whitespace between the end of the br tag and the start of the closing tag, so the regex expects to see </h1> immediately after <br>. Add a \s* in between to allow that.
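The \s* suggestion can be sketched as follows. This is a minimal example, not the asker's actual code: the class name H1Demo, the field names, and the sample inputs are made up for illustration, and the only change to the pattern is the \s* placed before </h1>.

```csharp
using System;
using System.Text.RegularExpressions;

public static class H1Demo
{
    // Pattern from the question: nothing is allowed between <br> and </h1>.
    public static readonly Regex Original = new Regex(
        @"<h1>(?'name'[\w\d\s]+?)(<br\s?/?>)?</h1>", RegexOptions.Singleline);

    // Same pattern with \s* added before the closing tag (hypothetical name).
    public static readonly Regex WithWhitespace = new Regex(
        @"<h1>(?'name'[\w\d\s]+?)(<br\s?/?>)?\s*</h1>", RegexOptions.Singleline);

    public static void Main()
    {
        string[] inputs =
        {
            "<h1>test content<br>\n</h1>",   // line break before </h1>
            "<h1>test content<br>\r\n</h1>", // Windows-style line break
            "<h1>test content<br></h1>",
            "<h1>test content</h1>",
        };

        foreach (string input in inputs)
        {
            Match m = WithWhitespace.Match(input);
            // Print whether each pattern matches, plus the captured name.
            Console.WriteLine("original: {0}, with \\s*: {1}, name: '{2}'",
                Original.IsMatch(input), m.Success, m.Groups["name"].Value);
        }
    }
}
```

In this sketch the original pattern rejects the first two inputs, while the \s* variant accepts all four and captures "test content" as the name. Because \s matches both \r and \n, the same pattern covers Unix and Windows line endings.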
You have it defined as a single-line regex; see the RegexOptions.Singleline flag :) Use RegexOptions.Multiline.
The newline character in C# is \n. However, I am not skilled in regex and couldn't tell you what would happen if there was a newline in a regex expression.
You can either add a dot (.) to your regex before the ending </h1> and keep the RegexOptions.Singleline option, or change it to RegexOptions.Multiline and add a $ to the regex before the </h1>.
Use the Multiline flag. (Edited to address my misspeaking about the .NET platform.)
Singleline mode treats the entire string you are passing in as one entry. Therefore ^ and $ represent the start and end of the entire string and not the beginning and ending of a line within the string. For example, <h1>(?'name'[\w\d\s]+?)(<br\s?/?>)?</h1> will match a string with no line break inside the tag, such as <h1>test content<br></h1>.
Multiline mode changes the meaning of ^ and $ to the beginning and ending of each line within the string (i.e. they will look at every line break).
In short, you need to tell the regex parser you expect to work with multiple lines. It helps to have a regex designer that speaks your dialect of regex. There are many.
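To see what the two flags discussed in these answers change in .NET, here is a small sketch. The class FlagDemo, the helper Check, and the short "a\nb" test strings are hypothetical names and inputs chosen for illustration. RegexOptions.Singleline makes . match \n, and RegexOptions.Multiline makes ^ and $ match at the start and end of each line. Neither flag changes what \s or a literal tag matches.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FlagDemo
{
    // Pattern and multi-line input taken from the question.
    public const string Pattern = @"<h1>(?'name'[\w\d\s]+?)(<br\s?/?>)?</h1>";
    public const string Input = "<h1>test content<br>\n</h1>";

    // Hypothetical helper: thin wrapper around Regex.IsMatch.
    public static bool Check(string input, string pattern, RegexOptions options)
    {
        return Regex.IsMatch(input, pattern, options);
    }

    public static void Main()
    {
        // Singleline only affects '.': it also matches '\n' when the flag is set.
        Console.WriteLine(Check("a\nb", "a.b", RegexOptions.None));
        Console.WriteLine(Check("a\nb", "a.b", RegexOptions.Singleline));

        // Multiline only affects '^' and '$': they also match at line boundaries.
        Console.WriteLine(Check("a\nb", "^a$", RegexOptions.None));
        Console.WriteLine(Check("a\nb", "^a$", RegexOptions.Multiline));

        // The question's pattern uses none of '.', '^' or '$', so the flag
        // alone does not change the result for the multi-line input.
        foreach (RegexOptions options in new[]
                 { RegexOptions.None, RegexOptions.Singleline, RegexOptions.Multiline })
        {
            Console.WriteLine("{0}: {1}", options, Check(Input, Pattern, options));
        }
    }
}
```

Since the pattern from the question contains no ., ^ or $, switching flags leaves its result on the multi-line input unchanged in this sketch. The pattern itself has to allow for the line break, for example with \s* before </h1>.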