Regex match whether or not the string contains a newline (multiline)

Posted on 2025-02-12 00:40:27

I am trying to parse log files where some of them are single line logs, some are multiline. The regex I have works fine for single lines but not for multi-lines.

^(?<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+(\+|-)\d+:\d+)\s+\[(?<severity>\w+)\](?<message>.*)$

This is where the match is failing because it does not detect the string after the new line.

2022-06-27T15:22:35.508+00:00 [Info] New settings received:
{"indexer.settings.compaction.days_of_week":"Sunday,Monday"}

The new line should be included in the "message" group until it detects a new timestamp.

I tried multiple approaches to include the newline to be matched but didn't find any solution yet. I have pasted both log formats in the link: https://regex101.com/r/ftJ3UZ/1.
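To make the failure concrete, here is a minimal Python sketch of the behaviour described above. It rests on a few assumptions: Python's `re` module spells named groups `(?P<name>...)`, so the pattern is adapted accordingly; the `re.MULTILINE` flag stands in for the multiline mode presumably enabled on regex101; and the third log line is made up for illustration.

```python
import re

# Pattern from the question, with (?<name>...) rewritten as (?P<name>...)
# because that is the named-group syntax Python's re module accepts.
PATTERN = re.compile(
    r"^(?P<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+(\+|-)\d+:\d+)\s+"
    r"\[(?P<severity>\w+)\](?P<message>.*)$",
    re.MULTILINE,  # ^ and $ match at every line boundary
)

# First two lines come from the question; the last entry is hypothetical.
LOG = (
    "2022-06-27T15:22:35.508+00:00 [Info] New settings received:\n"
    '{"indexer.settings.compaction.days_of_week":"Sunday,Monday"}\n'
    "2022-06-27T15:22:36.000+00:00 [Info] Done\n"
)

# "." does not match a newline, so each message stops at the end of its line
# and the JSON continuation line is never captured.
messages = [m.group("message") for m in PATTERN.finditer(LOG)]
print(messages)
```

The continuation line starts with `{` rather than a timestamp, so it is neither part of the first match nor a match of its own.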

Comments (2)

只为一人 2025-02-19 00:40:28

If lookaheads are supported, you can put an optional repeating group inside the message group. Before consuming each following line, it checks that the line does not start with a date-like pattern (or, to be stricter, the full timestamp).

^(?<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+([+-])\d+:\d+)\s+\[(?<severity>\w+)\](?<message>.*(?:\n(?!\d+-\d+-\d+T).*)*)$

Regex demo
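As a rough illustration of how this pattern could be used outside regex101, the following Python sketch applies it to a small log. The assumptions are the same as for any Python port: named groups are written `(?P<name>...)`, `re.MULTILINE` is enabled, and the last log entry is invented for the example.

```python
import re

# Pattern from this answer, adapted to Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"^(?P<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+([+-])\d+:\d+)\s+"
    r"\[(?P<severity>\w+)\]"
    # First line of the message, then any following lines that do NOT
    # start with a date-like prefix (checked by the negative lookahead).
    r"(?P<message>.*(?:\n(?!\d+-\d+-\d+T).*)*)$",
    re.MULTILINE,
)

# First two lines come from the question; the last entry is hypothetical.
LOG = (
    "2022-06-27T15:22:35.508+00:00 [Info] New settings received:\n"
    '{"indexer.settings.compaction.days_of_week":"Sunday,Monday"}\n'
    "2022-06-27T15:22:36.000+00:00 [Info] Done\n"
)

entries = [
    # strip() drops the leading space and any trailing blank line the
    # repeating group may have absorbed at the end of the input.
    (m.group("severity"), m.group("message").strip())
    for m in PATTERN.finditer(LOG)
]
for severity, message in entries:
    print(severity, repr(message))
```

Because the repeating group also accepts empty lines, a trailing newline at the end of the file ends up inside the last message, which is why the sketch strips each captured message.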

抹茶夏天i‖ 2025-02-19 00:40:28

It seems this would match:

^(?<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+(\+|-)\d+:\d+)\s+\[(?<severity>\w+)\](?<message>.*)\n(?:{.*})?

I've removed $ and added \n(?:{.*})? to the end so that the optional part inside {} braces is matched as well.
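A short Python sketch can show what this variant captures. It assumes the same adaptations as a typical Python port: `(?P<name>...)` for named groups, `re.MULTILINE` enabled, and the braces escaped as `\{` and `\}` to keep them unambiguous literals. The log text is the sample from the question.

```python
import re

# Pattern from this answer, adapted to Python; braces are escaped here,
# which is a small deviation from the pattern as posted.
PATTERN = re.compile(
    r"^(?P<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+(\+|-)\d+:\d+)\s+"
    r"\[(?P<severity>\w+)\](?P<message>.*)\n(?:\{.*\})?",
    re.MULTILINE,
)

LOG = (
    "2022-06-27T15:22:35.508+00:00 [Info] New settings received:\n"
    '{"indexer.settings.compaction.days_of_week":"Sunday,Monday"}\n'
)

first = PATTERN.search(LOG)
whole_match = first.group(0)         # spans both lines
message = first.group("message")     # still only the first line
print(repr(whole_match))
print(repr(message))
```

The overall match now covers the brace-delimited line, but that line sits outside the `message` group, and only a single continuation line of the form `{...}` is handled. Each entry must also be followed by a newline for the pattern to match at all.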
