Preventing duplicate matches in a regular expression

Posted 2024-08-10 10:08:01

The following code

string expression = "(\\{[0-9]+\\})";
RegexOptions options = ((RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline) | RegexOptions.IgnoreCase);
Regex tokenParser = new Regex(expression, options);

MatchCollection matches = tokenParser.Matches("The {0} is a {1} and the {2} is also a {1}");

will match and capture "{0}", "{1}", "{2}" and "{1}".

Is it possible to change it (either the regular expression or the Regex options) so that it matches and captures only "{0}", "{1}" and "{2}"? In other words, each distinct token should be captured only once.

孤独陪着我 2024-08-17 10:08:01

Here is what I came up with.

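// Two strings match when they contain the same set of tokens.
private static bool TokensMatch(string t1, string t2)
{
  return TokenString(t1) == TokenString(t2);
}

// Builds a canonical string from the distinct tokens in the input, sorted.
// Requires System.Linq and System.Text.RegularExpressions.
private static string TokenString(string input)
{
  Regex tokenParser = new Regex(@"(\{[0-9]+\})|(\[.*?\])");

  string[] tokens = tokenParser.Matches(input).Cast<Match>()
      .Select(m => m.Value).Distinct().OrderBy(s => s).ToArray();

  return String.Join(String.Empty, tokens);
}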

Note that the regular expression differs from the one in my question because I cater for two types of token: numbered ones delimited by {} and named ones delimited by [].

小霸王臭丫头 2024-08-17 10:08:01

Regular expressions solve lots of problems, but not every problem. How about using other tools in the toolbox?

// Cast<Match>() is needed where MatchCollection is non-generic (.NET Framework).
var parameters = new HashSet<string>(
    matches.Cast<Match>().Select(mm => mm.Value));

or

var parameters = matches.Cast<Match>().Select(mm => mm.Value).Distinct();
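
The same idea can be sketched as a small self-contained program. This is only an illustration, not part of the original answer: the names `TokenDemo` and `DistinctTokens` are hypothetical, while the pattern and the input string come from the question. It uses an explicit `HashSet<string>` alongside a list so that tokens come back in the order they first appear.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Hypothetical helper: returns each numbered token once, in the order first seen.
    public static List<string> DistinctTokens(string input)
    {
        var tokenParser = new Regex(@"\{[0-9]+\}");
        var seen = new HashSet<string>();
        var result = new List<string>();

        foreach (Match m in tokenParser.Matches(input))
        {
            // HashSet.Add returns false when the value is already present,
            // so repeated tokens are skipped.
            if (seen.Add(m.Value))
            {
                result.Add(m.Value);
            }
        }
        return result;
    }

    public static void Main()
    {
        var tokens = DistinctTokens("The {0} is a {1} and the {2} is also a {1}");
        Console.WriteLine(string.Join(", ", tokens)); // {0}, {1}, {2}
    }
}
```

The regex still finds all four matches; the duplicate is dropped afterwards in ordinary code, which keeps the pattern itself simple.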

茶底世界 2024-08-17 10:08:01

Here's something you could use for a pure regex solution:

Regex r = new Regex(@"(\{[0-9]+\}|\[[^\[\]]+\])(?<!\1.*\1)",
                    RegexOptions.Singleline);

But for the sake of both efficiency and maintainability, you're probably better off with a mixed solution like the one you posted.
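
For readers who want to try the pure regex approach, here is a minimal sketch that wraps the pattern above; the names `LookbehindDemo` and `FirstOccurrences` are hypothetical. It relies on .NET allowing variable-length lookbehind: after group 1 captures a token, the negative lookbehind rejects the match if the text ending at that point contains the same token twice, meaning an earlier copy exists.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper wrapping the pattern from the answer above.
    public static string[] FirstOccurrences(string input)
    {
        // Group 1 captures a numbered or named token. The lookbehind
        // (?<!\1.*\1) fails the match when the same token already
        // occurred earlier in the input.
        var r = new Regex(@"(\{[0-9]+\}|\[[^\[\]]+\])(?<!\1.*\1)",
                          RegexOptions.Singleline);
        return r.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();
    }

    public static void Main()
    {
        var tokens = FirstOccurrences("The {0} is a {1} and the {2} is also a {1}");
        Console.WriteLine(string.Join(", ", tokens));
    }
}
```

Because the lookbehind may rescan everything before each candidate token, the cost is likely to grow quickly on long inputs, which fits the efficiency caveat above.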

泪之魂 2024-08-17 10:08:01

If you only want one instance, change

string expression = "(\\{[0-9]+\\})"; // one or more repetitions

to

string expression = "(\\{[0-9]{1}})";  // exactly 1 repetition
