Implementing a generic Regex.Matches with string tags

Posted 2025-01-11 10:52:42

I have a function that gets the content between two tags in a string:

string content = string.Empty;
foreach (Match match in Regex.Matches(stringSource, "<tag1>(.*?)</tag1>"))
{
    content = match.Groups[1].Value;
}

I need to do this many times with different tags. I want to update the method so that I can pass in the opening and closing tags, but I can't get the tag parameters concatenated into the regular expression. When I pass these values to the new function, the expression does not work:

public string GetContent(string stringSource, string openTag, string closeTag)
{
    string content = string.Empty;
    foreach (Match match in Regex.Matches(stringSource, $"{openTag}(.*?){closeTag}"))
    {
        content = match.Groups[1].Value;
    }

    return content;
}

I want to use the function like this:

string content = GetContent(sourceString, "<tag1>", "</tag1>");

How can I make this work?


Comments (1)

万水千山粽是情ミ 2025-01-18 10:52:42

Try this:

public IEnumerable<string> GetContent(string stringSource, string tag)
{
    foreach (Match match in Regex.Matches(stringSource, $"<{tag}>(.*?)</{tag}>"))
    {
        yield return match.Groups[1].Value;
    }
}

// ...

var content = GetContent(sourceString, "tag1");

Note that I also changed the return type. What you had before was equivalent to calling this function like this: string content = GetContent(sourceString, "tag").LastOrDefault();
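
If you would rather keep the two-argument signature from the question, a minimal sketch along the following lines might work. It assumes the tags should be matched literally, so it passes them through Regex.Escape, and it assumes the content may span several lines, hence RegexOptions.Singleline. The names TagContent, GetContents and GetLastContent are made up for illustration and do not come from the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class TagContent
{
    // Returns the text between every openTag/closeTag pair, in order of appearance.
    public static IEnumerable<string> GetContents(string source, string openTag, string closeTag)
    {
        // Regex.Escape makes metacharacters such as '.', '[' or '(' in a tag match literally.
        string pattern = $"{Regex.Escape(openTag)}(.*?){Regex.Escape(closeTag)}";

        // Assumption: content may contain newlines, so let '.' match them too.
        foreach (Match match in Regex.Matches(source, pattern, RegexOptions.Singleline))
        {
            yield return match.Groups[1].Value;
        }
    }

    // Mirrors the loop in the question: the last match, or an empty string when there is none.
    public static string GetLastContent(string source, string openTag, string closeTag)
    {
        return GetContents(source, openTag, closeTag).LastOrDefault() ?? string.Empty;
    }
}
```

Calling TagContent.GetLastContent(sourceString, "<tag1>", "</tag1>") should then behave like the original loop, while GetContents gives you every match if you need more than the last one.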

Also, regex is generally a poor choice for handling HTML and XML. There are all kinds of edge cases around this, to the point that regex really doesn't work that well.

You can make it seem to work if you can constrain your input to a subset of the language to limit edge cases, and that might get you by for a while, but usually someone will eventually want to use more of the features of the markup language and you'll start getting weird bugs and errors. You'll really do much better with a dedicated, purpose-built parser!
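
As a rough illustration of the parser route, here is a sketch using XDocument from System.Xml.Linq. It assumes the input is well-formed XML with a single root element and no namespaces; real-world HTML would need an HTML parser instead. XmlTagContent and GetContents are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class; the names are illustrative only.
public static class XmlTagContent
{
    // Returns the text content of every element named tagName, in document order.
    public static IEnumerable<string> GetContents(string xml, string tagName)
    {
        // XDocument.Parse throws XmlException when the input is not well-formed XML.
        XDocument doc = XDocument.Parse(xml);

        // Descendants finds matching elements at any depth; Value is the element's text content.
        // ToList materializes the results before the method returns.
        return doc.Descendants(tagName).Select(element => element.Value).ToList();
    }
}
```

Unlike the regex version, this takes a tag name such as "tag1" rather than literal tag strings, and it fails loudly on malformed input instead of silently returning nothing.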
