Regex pattern to find all numbers that are not directly followed by a dot character

Posted 2025-01-16 23:58:14

Can anyone help me write a regex pattern for the requirements below? It should match:

Section tags that do not start with a number.
Section tags whose number is not followed by a dot character.
Only the number closest to the section tag (at the start of its content) should be considered.

Test String:

<sectionb>2.3. Optimized test sentence<op>(</op>1,1<cp>)</cp></sectionb>
*<sectiona>2 Surface Model: ONGV<op>(</op>1,1<cp>)</cp></sectiona>*
<sectiona>3. Verification of MKJU<op>(</op>1,1<cp>)</cp> Entity</sectiona>
*<sectionc>3. 2. 1 <txt>Case 1</txt> Annual charges to SGX</sectionc>*
*<sectiona>Compound Interest<role>back</role></sectiona>*

Pattern:

<section[a-z]>[\d]*[^\.]*<\/section[a-z]

The regex pattern should match the following strings:

<sectiona>2 Surface Model: ONGV<op>(</op>1,1<cp>)</cp></sectiona>
<sectionc>3. 2 1 <txt>Case 1</txt> Annual charges to SGX</sectionc>
<sectiona>Compound Interest<role>back</role></sectiona>

如果没结果 2025-01-23 23:58:14

This matches the updated requirements:

<section\w+>(((\d+\.\s*)*(\d+[^\.]))|[^\d]).*?<\/section\w>

<section\w+> \w is mostly the same as [a-z] (it also matches digits and underscores); + allows one or more letters (<sectiona>, <sectionabc>). Use * instead to also allow a bare <section>, or remove the + for exactly one letter. Note that the closing part <\/section\w> has no +, so add one there too if tag names can be longer than one letter.

(\d+\.\s*)* zero or more groups of digits, a dot, and any amount of whitespace. This matches the updated sectionc row, which now reads 3. 2. 1 with spaces after the dots.

(\d+[^\.]) one or more digits followed by a character that is not a dot.

((...)|[^\d]) or the section content does not start with a digit (matches row 5).

.*? followed by any characters, as few as possible, up to the following </section. This could likely be done with a lookahead to simplify the regex but, for me, this keeps the "no digits" clause separate.
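As a quick check of the explanation above, here is a minimal Python sketch that runs the pattern, unchanged, against the five test strings from the question. The names PATTERN, TEST_LINES and matching_lines are illustrative only, and the use of Python's re module is an assumption, since the question does not say which regex engine is in use.

```python
import re

# The pattern from the answer, copied verbatim.
PATTERN = re.compile(r"<section\w+>(((\d+\.\s*)*(\d+[^\.]))|[^\d]).*?<\/section\w>")

# The five test strings from the question (without the * markers).
TEST_LINES = [
    "<sectionb>2.3. Optimized test sentence<op>(</op>1,1<cp>)</cp></sectionb>",
    "<sectiona>2 Surface Model: ONGV<op>(</op>1,1<cp>)</cp></sectiona>",
    "<sectiona>3. Verification of MKJU<op>(</op>1,1<cp>)</cp> Entity</sectiona>",
    "<sectionc>3. 2. 1 <txt>Case 1</txt> Annual charges to SGX</sectionc>",
    "<sectiona>Compound Interest<role>back</role></sectiona>",
]


def matching_lines(lines):
    """Return the lines in which the pattern finds a match."""
    return [line for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    for line in matching_lines(TEST_LINES):
        print(line)
```

With these inputs the second, fourth and fifth strings are the ones that match, which are the three rows marked with * in the question.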

regex101
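One possible lookahead variant, offered as a sketch rather than as the answerer's own solution, replaces [^\.] with the negative lookahead (?![\d.]). The motivation is an edge case that is not in the question's test data: with \d+[^\.] the engine can backtrack so that \d+ takes only the 1 of 12 and [^\.] takes the 2, so a heading numbered 12. would still match. The names ORIGINAL, LOOKAHEAD and the 12. Intro sample are hypothetical, and Python's re module is assumed.

```python
import re

# The pattern from the answer, kept for comparison.
ORIGINAL = re.compile(r"<section\w+>(((\d+\.\s*)*(\d+[^\.]))|[^\d]).*?<\/section\w>")

# Hypothetical variant: (?![\d.]) requires that the last number is followed by
# neither a dot nor another digit, so \d+ cannot stop in the middle of a number.
LOOKAHEAD = re.compile(r"<section\w+>(?:(?:\d+\.\s*)*\d+(?![\d.])|\D).*?</section\w+>")

# The five test strings from the question (without the * markers).
TEST_LINES = [
    "<sectionb>2.3. Optimized test sentence<op>(</op>1,1<cp>)</cp></sectionb>",
    "<sectiona>2 Surface Model: ONGV<op>(</op>1,1<cp>)</cp></sectiona>",
    "<sectiona>3. Verification of MKJU<op>(</op>1,1<cp>)</cp> Entity</sectiona>",
    "<sectionc>3. 2. 1 <txt>Case 1</txt> Annual charges to SGX</sectionc>",
    "<sectiona>Compound Interest<role>back</role></sectiona>",
]

# Made-up sample with a multi-digit number followed by a dot.
MULTI_DIGIT = "<sectiona>12. Intro</sectiona>"


def matches(pattern, line):
    """Return True if the compiled pattern finds a match in the line."""
    return pattern.search(line) is not None


if __name__ == "__main__":
    for line in TEST_LINES + [MULTI_DIGIT]:
        print(matches(ORIGINAL, line), matches(LOOKAHEAD, line), line)
```

Both patterns agree on the five strings from the question; they differ only on the multi-digit sample, which the original pattern matches and the lookahead variant rejects.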
