C# Regex.Split method adds empty strings before parentheses

Posted on 2024-11-19 01:18:12

I have some code that tokenizes an equation input into a string array:

string infix = "( 5 + 2 ) * 3 + 4";
string[] tokens = tokenizer(infix, @"([\+\-\*\(\)\^\\])");
foreach (string s in tokens)
{
   Console.WriteLine(s);
}

Now here is the tokenizer function:

public string[] tokenizer(string input, string splitExp)
{
    string noWSpaceInput = Regex.Replace(input, @"\s", "");
    Console.WriteLine(noWSpaceInput);
    Regex RE = new Regex(splitExp);
    return (RE.Split(noWSpaceInput));
}

When I run this, all the characters are split, but an empty string is inserted before the parenthesis characters. How do I remove it?

//empty string here

(

5

+

2

//empty string here

)

*

3

+

4


Comments (5)

所谓喜欢 2024-11-26 01:18:12

I would just filter them out:

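// Requires: using System.Linq; using System.Text.RegularExpressions;
public string[] tokenizer(string input, string splitExp)
{
    string noWSpaceInput = Regex.Replace(input, @"\s", "");
    Console.WriteLine(noWSpaceInput);
    Regex RE = new Regex(splitExp);
    // Drop the zero-length entries that Split produces around adjacent separators.
    return RE.Split(noWSpaceInput).Where(x => !string.IsNullOrEmpty(x)).ToArray();
}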
仄言 2024-11-26 01:18:12

What you're seeing happens because the input has nothing before a separator (the string starts with "(") and then two separator characters next to one another (")*" in the middle). Split returns the empty string that lies before or between those separators. This is by design.

As you may have found with String.Split, that method takes an optional enum (StringSplitOptions) that lets it remove any empty entries. Regex.Split has no such parameter. In your specific case you could simply ignore any token with a length of 0.

foreach (string s in tokens.Where(tt => tt.Length > 0))
{
   Console.WriteLine(s);
}
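
To make the "by design" behavior concrete, here is a minimal sketch that runs the question's pattern over the whitespace-free input and prints the raw result next to the filtered one. The class and method names (SplitDemo, RawSplit, SplitWithoutEmpties) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Same operator pattern as in the question; the capture group makes
    // Regex.Split return the separators as well as the text between them.
    private static readonly Regex Operators = new Regex(@"([\+\-\*\(\)\^\\])");

    // Raw result of Regex.Split, empty strings included.
    public static string[] RawSplit(string input)
    {
        return Operators.Split(input);
    }

    // Same split, with the zero-length entries dropped afterwards.
    public static string[] SplitWithoutEmpties(string input)
    {
        return RawSplit(input).Where(t => t.Length > 0).ToArray();
    }

    public static void Main()
    {
        string input = "(5+2)*3+4";
        // Brackets make the empty entries visible in the console output.
        Console.WriteLine(string.Join(" ", RawSplit(input).Select(t => "[" + t + "]")));
        Console.WriteLine(string.Join(" ", SplitWithoutEmpties(input).Select(t => "[" + t + "]")));
    }
}
```

In the first printed line the empty entries show up as [] at the start (nothing precedes the opening parenthesis) and between ")" and "*" (two separators in a row). The second line contains only the nine real tokens.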
绾颜 2024-11-26 01:18:12

Well, one option would be to filter them out afterwards:

return RE.Split(noWSpaceInput).Where(x => !string.IsNullOrEmpty(x)).ToArray();
是你 2024-11-26 01:18:12

Try this (if you don't want to filter the result):

tokenizer(infix, @"(?=[-+*()^\\])|(?<=[-+*()^\\])");

Perl demo:

perl -E "say join ',', split /(?=[-+*()^])|(?<=[-+*()^])/, '(5+2)*3+4'"
(,5,+,2,),*,3,+,4

Although, in my opinion, it would be better to use a match instead of a split in this case.
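
The match-based idea could look roughly like the sketch below. The token pattern and the names MatchTokenizer and Tokenize are assumptions for illustration: the pattern treats each character from the question's operator set as one token and any other run of non-whitespace characters (such as a number) as another. As far as I know, .NET's Regex.Split also emits an empty leading entry when a zero-width match occurs at position 0, so the lookaround pattern may still leave one empty string when the input starts with a parenthesis. Matching sidesteps that.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchTokenizer
{
    // Hypothetical token pattern: either one operator/parenthesis character
    // from the question's set, or a run of characters that are neither
    // operators nor whitespace (for example a number).
    private static readonly Regex Token = new Regex(@"[\-+*()^\\]|[^\-+*()^\\\s]+");

    public static string[] Tokenize(string input)
    {
        // Every match is at least one character long, so no filtering is needed,
        // and whitespace is skipped because neither alternative matches it.
        return Token.Matches(input)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        foreach (string token in Tokenize("( 5 + 2 ) * 3 + 4"))
        {
            Console.WriteLine(token);
        }
    }
}
```

Because whitespace is never matched, the separate Regex.Replace step that strips it from the input is not needed with this approach.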

葬シ愛 2024-11-26 01:18:12

I think you can use StringSplitOptions.RemoveEmptyEntries with String.Split (this relies on the tokens being separated by spaces, as they are in your input):

    static void Main(string[] args)
    {
        string infix = "( 5 + 2 ) * 3 + 4";
        string[] results = infix.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
        foreach (var result in results)
            Console.WriteLine(result);

        Console.ReadLine();
    }