Regular expression to match multiple strings with a positive lookbehind

Posted 2024-11-09 13:03:16

So I have been trying to combine the answers of these two questions:
C# split string but keep split chars\separators
Regex to match multiple strings

Essentially I'd like to be able to split a string around certain strings and have the splitting strings in the output array of Regex.Split() as well. Here is what I have tried so far:

// ** I'd also like to have UNION ALL but not sure how to add that
private const string CompoundSelectRegEx = @"(?<=[\b(UNION|INTERSECT|EXCEPT)\b])";
string sql = "SELECT TOP 5 * FROM Persons UNION SELECT TOP 5 * FROM Persons INTERSECT SELECT TOP 5 * FROM Persons EXCEPT SELECT TOP 5 * FROM Persons";

string[] strings = Regex.Split(sql, CompoundSelectRegEx);

The problem is that it starts matching individual characters like E and U so I get an incorrect array of strings.

I'd also like to match around UNION ALL, but since that's a string rather than a single word, I wasn't sure how to add it to the above regex. If someone could point me in the right direction there as well, that would be great!

Thanks!

Comments (1)

不及他 2024-11-16 13:03:16

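If you want to split on those words and include them in the results, simply put them in an alternation inside a capturing group: Regex.Split() includes the text matched by capturing groups in its output array. There's no need for look-arounds. Your pattern matches single characters because the square brackets turn its contents into a character class, so any one of the listed characters satisfies it. This pattern should fit your needs: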

string pattern = @"\b(UNION(?:\sALL)?|INTERSECT|EXCEPT)\b";

The (?:\sALL)? part makes the word ALL optional: (?:...) is a non-capturing group, which matches the enclosed pattern without capturing it, and the trailing ? makes the whole group optional. If you want to trim the results, you could add \s* at both ends of the pattern; a \s* at the end alone only removes the whitespace that follows each keyword.
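
For illustration, here is a minimal sketch of that pattern in use, written as a top-level program (C# 9 or later). The \s* on either side, the \s+ before ALL, and RegexOptions.IgnoreCase are additions made for this sketch, not part of the pattern above, and the sample input is the question's query with UNION changed to UNION ALL.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from this answer, with \s* added on both sides so the
// whitespace around each keyword is consumed by the match (an assumption
// made for this sketch). \s+ tolerates several spaces before ALL.
const string pattern = @"\s*\b(UNION(?:\s+ALL)?|INTERSECT|EXCEPT)\b\s*";

string sql = "SELECT TOP 5 * FROM Persons UNION ALL SELECT TOP 5 * FROM Persons "
           + "INTERSECT SELECT TOP 5 * FROM Persons EXCEPT SELECT TOP 5 * FROM Persons";

// Regex.Split puts the text of the capturing group into the result array,
// so each operator appears between the statements it separates.
// IgnoreCase is an assumption here, since SQL keywords are case-insensitive.
string[] parts = Regex.Split(sql, pattern, RegexOptions.IgnoreCase);

foreach (string part in parts)
{
    Console.WriteLine(part);
}
```

With this input the array holds seven entries: the four SELECT statements with UNION ALL, INTERSECT and EXCEPT between them, none carrying leading or trailing whitespace.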

Be aware that this might work for simple SQL statements, but once you start dealing with nested queries the above approach will probably break down. At that point a regex might not be the best solution and you should develop a parser instead.
