Preventing duplicate matches in a regular expression

Posted on 2024-08-10 10:08:01

The following code

string expression = "(\\{[0-9]+\\})";
RegexOptions options = ((RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline) | RegexOptions.IgnoreCase);
Regex tokenParser = new Regex(expression, options);

MatchCollection matches = tokenParser.Matches("The {0} is a {1} and the {2} is also a {1}");

will match and capture "{0}", "{1}", "{2}" and "{1}".

Is it possible to change it (either the regular expression or the RegexOptions) so that it matches and captures only "{0}", "{1}" and "{2}"? In other words, each distinct token should be captured only once.

孤独陪着我 2024-08-17 10:08:01

Here is what I came up with.

private static bool TokensMatch(string t1, string t2)
{
  return TokenString(t1) == TokenString(t2);
}

private static string TokenString(string input)
{
  Regex tokenParser = new Regex(@"(\{[0-9]+\})|(\[.*?\])");

  string[] tokens = tokenParser.Matches(input).Cast<Match>()
      .Select(m => m.Value).Distinct().OrderBy(s => s).ToArray<string>();

  return String.Join(String.Empty, tokens);
}

Note that the regular expression differs from the one in my question because I cater for two types of token: numbered ones delimited by {} and named ones delimited by [].
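A minimal, self-contained sketch of how the helpers above might be exercised, assuming the goal is to check that two format strings use the same set of tokens regardless of order or repetition. The class name `TokenComparer` and the sample strings are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenComparer
{
    // Numbered tokens such as {0}, or named tokens such as [name].
    private static readonly Regex TokenParser =
        new Regex(@"(\{[0-9]+\})|(\[.*?\])");

    // Builds a key from the distinct tokens, sorted and concatenated.
    public static string TokenString(string input)
    {
        string[] tokens = TokenParser.Matches(input).Cast<Match>()
            .Select(m => m.Value)
            .Distinct()
            .OrderBy(s => s)
            .ToArray();
        return string.Join(string.Empty, tokens);
    }

    // Two strings match when they use the same set of tokens.
    public static bool TokensMatch(string t1, string t2)
    {
        return TokenString(t1) == TokenString(t2);
    }

    public static void Main()
    {
        // The repeated {1} is collapsed by Distinct().
        Console.WriteLine(TokenString("The {0} is a {1} and the {2} is also a {1}"));
        // Same token set in a different order and with a repeat.
        Console.WriteLine(TokensMatch("{1} then {0}", "{0}, {1} and {1} again"));
    }
}
```

Because the tokens are sorted before being joined, the comparison ignores both the order in which tokens appear and how often each one is repeated.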

小霸王臭丫头 2024-08-17 10:08:01

Regular expressions solve lots of problems, but not every problem. How about using other tools in the toolbox?

var parameters = new HashSet<string>(
    matches.Cast<Match>().Select(mm => mm.Value));

Or

var parameters = matches.Cast<Match>().Select(mm => mm.Value).Distinct();

茶底世界 2024-08-17 10:08:01

Here's something you could use for a pure regex solution:

Regex r = new Regex(@"(\{[0-9]+\}|\[[^\[\]]+\])(?<!\1.*\1)",
                    RegexOptions.Singleline);

But for the sake of both efficiency and maintainability, you're probably better off with a mixed solution like the one you posted.
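A small sketch of how that pattern behaves on the sentence from the question. The class and method names (`FirstOccurrenceDemo`, `FirstOccurrences`) are made up for illustration; only the pattern and the `Singleline` option come from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class FirstOccurrenceDemo
{
    // Group 1 captures the token. The negative lookbehind then rejects the
    // match if the text up to this point already holds an earlier copy of it.
    private static readonly Regex FirstOnly = new Regex(
        @"(\{[0-9]+\}|\[[^\[\]]+\])(?<!\1.*\1)",
        RegexOptions.Singleline);

    public static string[] FirstOccurrences(string input)
    {
        return FirstOnly.Matches(input).Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        string input = "The {0} is a {1} and the {2} is also a {1}";
        // Lists each token once, in order of first appearance.
        Console.WriteLine(string.Join(", ", FirstOccurrences(input)));
    }
}
```

The lookbehind has to rescan the preceding text for every candidate token, which is one reason the combination of `Matches` and `Distinct()` is likely to be cheaper on long inputs.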

泪之魂 2024-08-17 10:08:01

If you only want one instance, change

string expression = "(\\{[0-9]+\\})"; // one or more repetitions

to

string expression = "(\\{[0-9]{1}\\})"; // exactly one repetition
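To see what this quantifier change does in practice, here is a hedged sketch using a variant of the question's sentence (a `{10}` token is added as an assumption to make the difference visible). The `{1}` quantifier limits how many digits a single token may contain; it does not limit how many times a token can match across the input, so a repeated `{1}` would still be returned twice. The class and method names are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Returns every match of the pattern, duplicates included.
    public static string[] AllMatches(string pattern, string input)
    {
        return Regex.Matches(input, pattern).Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        string input = "The {0} is a {1} and the {10} is also a {1}";
        // One or more digits: {10} counts as a token.
        Console.WriteLine(string.Join(", ", AllMatches(@"(\{[0-9]+\})", input)));
        // Exactly one digit: {10} no longer matches, but {1} still appears twice.
        Console.WriteLine(string.Join(", ", AllMatches(@"(\{[0-9]{1}\})", input)));
    }
}
```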