Regular expression misses matches in a string

Posted on 2025-02-13 05:04:05


I'm trying to write a regular expression that captures the desired strings between the start markers ("f38 ","f38 ","f1 ", "..") and the end markers ("\par","\hich","{","}","","..") in a decompiled DOC file, and to append each match to an array that will eventually be printed out to a new file.

I'm having trouble catching certain strings between "f38 " and "\hich". This usually happens when the string spans multiple lines, but I've found at least one exception in the example snippet of the DOC file I'm using on regex101.com.

Here is the regular expression as I have it now:

(?<=f38  |f38 | |f1 |\.\.)\w.+(?=\\par|\\cell |\\hich|{|}|\\|\.\.)

The troublesome matches come out including "\hich", like "e\hich" and "d\hich". In these examples I want to match "e" and "d" respectively, not the \hich portion. I suspect the problem is somehow related to how newlines/line breaks are handled.

Here is a smaller snippet of the input string. I have bolded what is matched, and bolded and capitalized the problematic match. From this I want the "e", not the \hich. Note that earlier in the snippet there are 2 examples where things go right and "\hich" is not included in the match.

l\hich\af38\dbch\af31505\loch\f38 ..ikely to involve asbestos exposure: removal, encapsulation, alteration, repair, maintenance, insulation, spill/emergency clean-up, transportation, disposal and storage of ACM. The general industry standards cover all other operations where exposure to asb..\hich\af38\dbch\af31505\loch\f38 E\HICH\af38\dbch\af31505\loch\f38 stos is possible

Here is an example with a longer portion of the input string at regex101.com: https://regex101.com/r/qqtls7/1

Any help would be appreciated. Thanks!


云柯 2025-02-20 05:04:05


The problem is with the part of the pattern that is meant to match those single-character samples. \w.+ requires at least two characters to match. So when you get "e\hich", that first backslash gets matched by the dot in the regex, and the match lasts until the next backslash (which is one of the "terminators" listed in the positive lookahead portion of the regex).
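
To make the two-character minimum concrete, here is a minimal sketch in Python (the language is an assumption, since the question does not name one). It tests only the bare \w.+ and \w.* fragments, without the lookarounds, and the helper name fully_matches is hypothetical.

```python
import re

# \w consumes exactly one word character; .+ then needs at least one more
# character, so the fragment cannot match a single-character string.
two_or_more = re.compile(r"\w.+")

# With .* the second part may be empty, so one character is enough.
one_or_more = re.compile(r"\w.*")


def fully_matches(pattern, text):
    """Return True when the whole of `text` is matched by `pattern`."""
    return pattern.fullmatch(text) is not None


print(fully_matches(two_or_more, "e"))    # single character, .+ has nothing left
print(fully_matches(two_or_more, "e\\"))  # the dot is free to take the backslash
print(fully_matches(one_or_more, "e"))    # .* accepts the empty remainder
```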

You might want to use * instead of +.
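
A minimal sketch of how the quantifier choice plays out with the full pattern, again in Python as an assumption. Python's re module only accepts fixed-width lookbehinds, so the start markers from the question are rewritten here as one lookbehind per marker. The sample text is a hypothetical two-line stand-in for the wrapped DOC dump, and find_all is a hypothetical helper. The lazy variant \w.*? is an addition made for this sketch and is not part of the suggestion above.

```python
import re

# Start markers from the question's lookbehind, one fixed-width lookbehind each
# (Python's re rejects a single lookbehind whose alternatives differ in length).
START = r"(?:(?<=f38  )|(?<=f38 )|(?<= )|(?<=f1 )|(?<=\.\.))"

# End markers ("terminators") from the question's lookahead, braces escaped.
END = r"(?=\\par|\\cell |\\hich|\{|\}|\\|\.\.)"

# Hypothetical sample: the line break falls right after "\af38",
# mimicking how the dumped file might wrap.
WRAPPED = (
    r"\loch\f38 E\hich\af38"
    + "\n"
    + r"\dbch\af31505\loch\f38 stos is possible\par"
)


def find_all(body, text):
    """Run START + body + END over `text` and return every match as a list."""
    return re.findall(START + body + END, text)


# Compare the original body, the suggested one, and a lazy variant.
for body in (r"\w.+", r"\w.*", r"\w.*?"):
    print(body, find_all(body, WRAPPED))
```

On this sample the greedy \w.+ and \w.* both return E\hich as the first match, because the dot runs to the end of the line and backtracks only as far as the last terminator before the line break. The lazy \w.*? stops at the first terminator and returns E. Results on the real file may differ, since they depend on where its lines wrap.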
