Regex pattern to find all numbers that are not immediately followed by a dot character
Can anyone help me write a regex pattern for the requirements below? It should match:
- Section tags that don't have numbers.
- Section tags whose number is not followed by a dot character.
- Only the numbers immediately after the opening section tag should be considered.
Test String:
<sectionb>2.3. Optimized test sentence<op>(</op>1,1<cp>)</cp></sectionb>
*<sectiona>2 Surface Model: ONGV<op>(</op>1,1<cp>)</cp></sectiona>*
<sectiona>3. Verification of MKJU<op>(</op>1,1<cp>)</cp> Entity</sectiona>
*<sectionc>3. 2. 1 <txt>Case 1</txt> Annual charges to SGX</sectionc>*
*<sectiona>Compound Interest<role>back</role></sectiona>*
Pattern:
<section[a-z]>[\d]*[^\.]*<\/section[a-z]
The regex pattern should match the following strings:
<sectiona>2 Surface Model: ONGV<op>(</op>1,1<cp>)</cp></sectiona>
<sectionc>3. 2 1 <txt>Case 1</txt> Annual charges to SGX</sectionc>
<sectiona>Compound Interest<role>back</role></sectiona>
Comments (1)
This matches the updated requirements. The pattern is built from the following pieces, in order:
- <section\w+>
  \w is mostly the same as [a-z]. The + allows one or more characters after "section" (for example <sectionabc>); use * instead to also allow a bare <section>, or remove the + for exactly one letter.
- (\d+\.\s*)*
  Zero or more groups of digits, a dot, and any amount of whitespace. This matches the updated row containing 3. 2. 1, which now has spaces after the dots.
- (\d+[^\.])
  Must match a number without a dot after it, one or more digits.
- ((...)|[^\d])
  Or the section does not start with a digit (matches row 5).
- .*?
  Followed by any characters, as few as possible, up to the following
- </section
This could likely be done with a lookahead to simplify the regex but, for me, this keeps the "no digits" clause separate.
Demo: regex101
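The pieces above can be tried against the question's test strings with a short script. This is only a sketch: the assembled pattern is an assumption, since the answer's full one-line regex is not shown here. In particular, the "(...)" placeholder is read as the two numbering groups taken together, and the names PATTERN, TEST_LINES and matching_lines are made up for illustration.

```python
import re

# Assumed assembly of the answer's pieces. "(...)" is interpreted as
# (\d+\.\s*)*(\d+[^.]), i.e. dotted numbering followed by an undotted number.
PATTERN = re.compile(r"<section\w+>((\d+\.\s*)*(\d+[^.])|[^\d]).*?</section")

# The five test strings from the question, without the asterisk markers.
TEST_LINES = [
    "<sectionb>2.3. Optimized test sentence<op>(</op>1,1<cp>)</cp></sectionb>",
    "<sectiona>2 Surface Model: ONGV<op>(</op>1,1<cp>)</cp></sectiona>",
    "<sectiona>3. Verification of MKJU<op>(</op>1,1<cp>)</cp> Entity</sectiona>",
    "<sectionc>3. 2. 1 <txt>Case 1</txt> Annual charges to SGX</sectionc>",
    "<sectiona>Compound Interest<role>back</role></sectiona>",
]


def matching_lines(lines):
    """Return the lines in which the assumed pattern finds a match."""
    return [line for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    for line in matching_lines(TEST_LINES):
        print(line)
```

With this reading, rows 2, 4 and 5 are selected and rows 1 and 3 are rejected. One caveat: because \d+ can backtrack, a multi-digit number followed by a dot (for example 12.) is still accepted, with 1 matching \d+ and 2 matching [^.]. If such numbers can occur, the pattern would need tightening.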