Combine positive and negative lookahead?
I am not good with regex, but I have the expression below. I assume part of it means: look for 13 to 16 digits, then report success if 3 to 4 digits follow. The problem is that the 3 to 4 digits are optional, and they can also come before the 13 to 16 digit number, so I guess I want to combine positive lookahead/lookbehind with negative lookahead/lookbehind. This sounds way too complex; is there a simpler way?
(\d{13,16})[<"'].*?(?=[>"']\d{3,4}[<"'])[>"'](\d{3,4})[<"']
which will match the ccnum and the series in the following snippet:
<CreditCard>
name="John Doe""
ccnum=""1111123412341231""
series="339"
exp="03/13">
</CreditCard>
However, if I remove the ccnum or the series, it doesn't match anything, and the series is supposed to be optional. The series can also appear before or after the ccnum, so if I put the series attribute before the ccnum attribute, it doesn't match anything either. It also fails if I leave out the series element, or if the series comes before the ccnum as a separate element, such as:
<CreditCard>
<series>234</series>
<ccnum>1235583839293838</ccnum>
</CreditCard>
I need the regex to match the following scenarios, but I do not know the exact names of the elements; in this case, I just called them ccnum and series.
Here are the ones that work:
<CreditCard>
<ccnum>1235583839293838</ccnum>
<series>123</series>
</CreditCard>
<CreditCard ccnum="1838383838383833">
<series>123</series>
</CreditCard>
<CreditCard ccnum="1838383838383833" series="139"
</CreditCard>
It should also match the following, but does not:
<CreditCard ccnum="1838383838383833"
</CreditCard>
<CreditCard series="139" ccnum="1838383838383833"
</CreditCard>
<CreditCard ccnum="1838383838383833"></CreditCard>
<CreditCard>
<series>123</series>
<ccnum>1235583839293838</ccnum>
</CreditCard>
<CreditCard>
<ccnum series="123">1235583839293838</ccnum>
</CreditCard>
Right now, to get this to work, I am using 3 separate regular expressions:
1 to match a credit card number that comes before a security code.
1 to match a security code that comes before a credit card number.
1 to match just a credit card number.
I tried combining the expressions with an or, but I end up with 5 groups in total (2 from each of the first 2 expressions and 1 from the last one).
Comments (3)
It is probably much easier to pull the XML into an XDocument using its Parse method. Then you can use XPath or other means of finding that data.
As for the regex: your regex is too complex for me to comprehend, but this is how you make a certain block optional: "(thisisoptional)?".
And you cannot account for the two different orders except by writing both orders into the regex by hand. So if you want to match both "ab" and "ba", you need the following regex: "((ab)|(ba))". Everything appears in there twice. You can reduce the nastiness of this by factoring out "a" and "b" into a string variable each.
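The factoring idea can be sketched as follows. This is a minimal illustration in Python rather than .NET (the regex syntax used here is the same in both), and the names CCNUM, SERIES, GAP and find_pair are hypothetical. It assumes that only non-digit characters sit between the two values.

```python
import re

# Hypothetical building blocks, each written once as the answer suggests.
CCNUM = r"\d{13,16}"   # card number: 13 to 16 digits
SERIES = r"\d{3,4}"    # series / security code: 3 to 4 digits
GAP = r"\D+?"          # assumption: only non-digits separate the two values

# Both orders are spelled out by hand: "ab" or "ba".
PATTERN = re.compile(
    rf"(?<!\d)(?:"
    rf"(?P<cc_first>{CCNUM})(?!\d){GAP}(?P<series_after>{SERIES})(?!\d)"
    rf"|"
    rf"(?P<series_first>{SERIES})(?!\d){GAP}(?P<cc_second>{CCNUM})(?!\d)"
    rf")"
)

def find_pair(text):
    """Return (ccnum, series) for the first pair found, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    ccnum = m.group("cc_first") or m.group("cc_second")
    series = m.group("series_after") or m.group("series_first")
    return ccnum, series

print(find_pair('<series>234</series>\n<ccnum>1235583839293838</ccnum>'))
```

The `(?!\d)` and `(?<!\d)` guards keep a 3 to 4 digit run from matching inside a longer number. This version still requires both values to be present; it does not cover the case where the series is missing.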
You could try recursively traversing the XML document, scraping every attribute and text node that matches your expression for ccnum and series, and appending them to List<string> ccNumList and List<string> seriesList. If ccnum and series appear in the same order in the DOM tree hierarchy, then ccNumList[i] corresponds to seriesList[i]. An example of doing a recursive tree traversal is here.
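A rough sketch of that traversal, written in Python with xml.etree.ElementTree instead of the .NET types the answer has in mind. The function name collect, the wrapper element Cards, and the two patterns are assumptions made for illustration. It also assumes the input is well-formed XML, which some of the snippets in the question are not.

```python
import re
import xml.etree.ElementTree as ET

CCNUM = re.compile(r"\d{13,16}")   # whole value is a 13 to 16 digit number
SERIES = re.compile(r"\d{3,4}")    # whole value is a 3 to 4 digit number

def collect(element, cc_list, series_list):
    """Depth-first walk that checks every attribute value and the element text."""
    values = list(element.attrib.values())
    if element.text and element.text.strip():
        values.append(element.text.strip())
    for value in values:
        if CCNUM.fullmatch(value):
            cc_list.append(value)
        elif SERIES.fullmatch(value):
            series_list.append(value)
    for child in element:
        collect(child, cc_list, series_list)

# Hypothetical wrapper document holding two of the layouts from the question.
doc = ET.fromstring(
    '<Cards>'
    '<CreditCard ccnum="1838383838383833"><series>123</series></CreditCard>'
    '<CreditCard><series>234</series><ccnum>1235583839293838</ccnum></CreditCard>'
    '</Cards>'
)
cc_num_list, series_list = [], []
collect(doc, cc_num_list, series_list)
print(cc_num_list, series_list)
```

Pairing the lists by index only holds while every card has a series. Since the series is optional in the question, collecting one result per CreditCard element would be the safer variant.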
This will create three capture groups, where the ccnum is always in the second group, and the series can be in the first, the third, or none of the groups.
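The expression this answer refers to is not shown on this page. As an illustration only, and not the answerer's original, a pattern with the same three-group shape could look like the Python sketch below. The name match_card is hypothetical, and the pattern assumes that only non-digit characters separate the series from the card number.

```python
import re

PATTERN = re.compile(
    r"(?<!\d)(?:(\d{3,4})(?!\d)\D+?)?"  # group 1: optional series before the number
    r"(\d{13,16})(?!\d)"                # group 2: the card number, always present
    r"(?:\D+?(\d{3,4})(?!\d))?"         # group 3: optional series after the number
)

def match_card(text):
    """Return the (series_before, ccnum, series_after) groups, or None."""
    m = PATTERN.search(text)
    return m.groups() if m else None

print(match_card('<CreditCard series="139" ccnum="1838383838383833"'))
print(match_card('<CreditCard ccnum="1838383838383833"></CreditCard>'))
```

Because `\D+?` cannot cross digits, a digit-bearing attribute such as exp="03/13" placed between the card number and the series would leave the series group empty.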