Regex: match string contents up to a comment

Posted on 2024-11-13 11:15:49


I'm trying to match expressions of the form [%___%] in a line of code, but only those that appear before a // comment. A // inside a quoted string does not start a comment.
For example, in
[%tag%] = "a" + "//" + [%tag2%]; //[%tag3%]
the regex should match [%tag%] and [%tag2%], but not [%tag3%].

The closest I can get is ^(?:(?:\[%([^%\]\[]*)%\])|[^"]|"[^"]*")*?(?://)

This has two problems. First, it doesn't match any line that doesn't contain //. In fact, it spans multiple lines until it reaches one that does contain //.
I tried to fix this by adding ?.*?$ at the end, to make the // optional and stop at the first end of line, but that doesn't really work.

Second, it only captures the second tag. This isn't caused by the "//", since even with [%1%] [%2%] it won't capture the first one.

I'm using C# and Regex.Matches with RegexOptions.Multiline, and this is my escaped pattern string:

"^(?:(?:\\[%([^%\\]\\[]*)%\\])|[^\"]|\"[^\"]*\")*?(?://)"


狠疯拽 2024-11-20 11:15:49

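First, let me say that I love regular expressions. I read Friedl's Mastering Regular Expressions years ago and never looked back. That said, do not, do not use one giant regex to solve this problem. Use your programming language. You will end up with more readable and maintainable code.

It looks like you are trying to parse a language in which different rules apply in different contexts. Your pattern can appear inside a quoted string. A quoted string can itself contain quotes that need escaping. Capturing all of those subtleties in a single regex would be a nightmare. I suggest walking the string character by character, building tokens as you go, watching for quotes, and keeping track of whether you are inside a quoted string. When you hit a token that matches your criteria (you can use a regex for this part) and you are not inside a string, add it to your list. When you reach the end of a statement and hit the start of a comment, discard the remaining characters up to the end of the comment.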
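The scanning approach described above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the class name TagScanner, the method FindTags, the \w+ tag shape, and the rule that a backslash escapes the next character inside a string are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagScanner
{
    // Assumed tag shape: [%name%] with word characters inside.
    // \G anchors the match at the start index passed to Match().
    private static readonly Regex TagAtPosition = new Regex(@"\G\[%(\w+)%\]");

    // Returns the tag names found outside string literals and before any // comment.
    public static List<string> FindTags(string text)
    {
        var tags = new List<string>();
        bool inString = false;
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (inString)
            {
                // Assumption: a backslash escapes the following character.
                if (c == '\\') { i += 2; continue; }
                if (c == '"') inString = false;
                i++;
                continue;
            }
            if (c == '"') { inString = true; i++; continue; }
            if (c == '/' && i + 1 < text.Length && text[i + 1] == '/')
            {
                // Comment outside a string: skip to the end of the line.
                int eol = text.IndexOf('\n', i);
                if (eol < 0) break;
                i = eol + 1;
                continue;
            }
            if (c == '[')
            {
                Match m = TagAtPosition.Match(text, i);
                if (m.Success)
                {
                    tags.Add(m.Groups[1].Value);
                    i += m.Length;
                    continue;
                }
            }
            i++;
        }
        return tags;
    }
}
```

Called on the line from the question, FindTags should return the names tag and tag2 and skip tag3, because the scan jumps past everything after the unquoted //.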


放手` 2024-11-20 11:15:49

I think doing this in one shot is a little difficult, because matching the double quotes correctly is hard to check. You can do it in two phases:

1. Remove all double-quoted strings
2. Find your pattern in what is left

using System.Text.RegularExpressions;

// Phase 1: a double-quoted string (escaped quotes are not handled).
Regex re1 = new Regex(@"""[^""]*""");
// Phase 2: a tag with no // before it on the same line ('.' does not cross '\n').
Regex re2 = new Regex(@"(?<!//.*)\[%\w+%\]");
string input = @"[%tag%] = ""a"" + ""//"" + [%tag2%]; //[%tag3%]
[%tag%] = ""a"" + ""ii//"" + [%tag2%]; //[%tag3%]";

// Strip the strings first, then search the remainder.
MatchCollection ms = re2.Matches(re1.Replace(input, ""));
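For comparison, the two phases can probably be folded into a single pass by letting the regex consume strings and comments as alternatives and keeping only the matches where the tag group took part. The sketch below is an assumption-laden variant, not part of the answer above: the names SinglePassTags, FindTags, and the group name tag are made up for illustration, and strings are assumed to stay on one line with no escaped quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SinglePassTags
{
    // Alternatives are tried left to right at each position:
    //   1. a double-quoted string (consumed, so a // inside it is never read as a comment)
    //   2. a // comment up to the end of the line (consumed, so tags inside it are skipped)
    //   3. a tag, whose name lands in the "tag" group
    private static readonly Regex Token = new Regex(
        @"""[^""\r\n]*""|//[^\r\n]*|\[%(?<tag>\w+)%\]");

    // Returns the full text of every tag that is outside strings and comments.
    public static List<string> FindTags(string input)
    {
        var tags = new List<string>();
        foreach (Match m in Token.Matches(input))
        {
            // Only the third alternative sets the "tag" group.
            if (m.Groups["tag"].Success)
                tags.Add(m.Value);
        }
        return tags;
    }
}
```

One design difference from the two-phase version: because comments are consumed as whole tokens, a stray quote inside a comment cannot pair up with a quote on a later line, which can happen in phase 1 above since [^""]* is allowed to span lines.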

