Splitting a text script into substrings by pattern

Posted on 2024-09-27 14:44:40

Consider the following script (it is total nonsense, written in a pseudo-language):

if (Request.hostMatch("asfasfasf.com") && someString.existsIn(new String[] {"brr", "hrr"}))   {
    if (Request.clientIp("10.0.x.x")) {
        somevar = "1";
    }
    somevar = "2";
}
else {
    somevar = "first";
}
string foo = "foo";
// etc. etc.

How would you extract the if-block's parameters and contents from it? The if-block has the format:

if<whitespace>(<parameters>)<whitespace>{<contents>}<anything>

I tried using String.split() with the regex pattern ^if\s*\(|\)\s*\{|\}\s* but this fails miserably. The problem is that ) { also occurs in the inner if-block, and a closing } occurs in many places as well. I don't think either lazy or greedy matching works here.
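To make the lazy/greedy problem concrete, here is a small sketch that runs two variants of an if-block regex over the script above. The class name LazyVsGreedyDemo, its helper methods, and the two capture-group patterns are hypothetical and exist only for this illustration; they are not the pattern from the question. The lazy variant should stop at the inner block's closing brace, while the greedy one should run on into the else-block.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: shows why neither quantifier finds the matching brace.
class LazyVsGreedyDemo {
    static final String SCRIPT =
        "if (Request.hostMatch(\"asfasfasf.com\") && someString.existsIn(new String[] {\"brr\", \"hrr\"}))   {\n"
      + "    if (Request.clientIp(\"10.0.x.x\")) {\n"
      + "        somevar = \"1\";\n"
      + "    }\n"
      + "    somevar = \"2\";\n"
      + "}\n"
      + "else {\n"
      + "    somevar = \"first\";\n"
      + "}\n"
      + "string foo = \"foo\";\n";

    // Returns capture group 2 (the supposed block contents), or null if nothing matched.
    static String contents(String regex) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(SCRIPT);
        return m.find() ? m.group(2) : null;
    }

    // Lazy contents: ends at the FIRST '}', which belongs to the inner if-block.
    static String lazyContents() {
        return contents("^if\\s*\\((.*?)\\)\\s*\\{(.*?)\\}");
    }

    // Greedy contents: ends at the LAST '}', which belongs to the else-block.
    static String greedyContents() {
        return contents("^if\\s*\\((.*?)\\)\\s*\\{(.*)\\}");
    }

    public static void main(String[] args) {
        System.out.println("lazy:\n" + lazyContents());
        System.out.println("greedy:\n" + greedyContents());
    }
}
```

Neither quantifier can count nesting depth, which is why the closing brace it picks is only right by coincidence.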

So, any pointers on what I might need in order to implement this with regex?

I also need to get the remaining string without the if-block's code (so the code starting from else { ...). Using just String.split() makes this difficult, because it gives no information about the length of the parts that were parsed away.

I initially created a loop-based solution for this (using String.substring() heavily), but it's dull. I would like something fancier instead. Should I go with regex, or create a custom, generic function (there are many other cases than just this one) that takes the String to parse and a pattern (consider the if<whitespace>(... pattern above)?
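For comparison, one possible shape for the custom-function route is a scanner that counts nesting depth instead of relying on a regex. This is only a sketch under stated assumptions: the names IfBlockParser, IfBlock, findClose, skipWhitespace, expect and parseIf are all made up for this example, the input is assumed to start with "if", and only double-quoted string literals with backslash escapes are skipped (comments and other literal forms are not handled).

```java
// Hypothetical helper: extracts parameters, contents and the remainder of an if-block.
class IfBlockParser {
    /** Holds the three pieces: the text inside (...), the text inside {...}, and what follows. */
    static final class IfBlock {
        final String parameters;
        final String contents;
        final String rest;
        IfBlock(String parameters, String contents, String rest) {
            this.parameters = parameters;
            this.contents = contents;
            this.rest = rest;
        }
    }

    /**
     * Returns the index of the delimiter closing the one at openIdx,
     * tracking nesting depth and skipping double-quoted string literals.
     */
    static int findClose(String s, int openIdx, char open, char close) {
        int depth = 0;
        boolean inString = false;
        for (int i = openIdx; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inString) {
                if (c == '\\') i++;                  // skip the escaped character
                else if (c == '"') inString = false; // end of the literal
            } else if (c == '"') {
                inString = true;
            } else if (c == open) {
                depth++;
            } else if (c == close && --depth == 0) {
                return i;                            // back at the outermost level
            }
        }
        throw new IllegalArgumentException("unbalanced '" + open + "' at index " + openIdx);
    }

    static int skipWhitespace(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) i++;
        return i;
    }

    static void expect(String s, int i, char c) {
        if (i >= s.length() || s.charAt(i) != c)
            throw new IllegalArgumentException("expected '" + c + "' at index " + i);
    }

    /** Parses if<whitespace>(<parameters>)<whitespace>{<contents>}<anything>. */
    static IfBlock parseIf(String script) {
        if (!script.startsWith("if")) throw new IllegalArgumentException("expected 'if'");
        int parenOpen = skipWhitespace(script, 2);
        expect(script, parenOpen, '(');
        int parenClose = findClose(script, parenOpen, '(', ')');
        int braceOpen = skipWhitespace(script, parenClose + 1);
        expect(script, braceOpen, '{');
        int braceClose = findClose(script, braceOpen, '{', '}');
        return new IfBlock(
            script.substring(parenOpen + 1, parenClose),
            script.substring(braceOpen + 1, braceClose),
            script.substring(braceClose + 1));       // everything after the if-block
    }

    public static void main(String[] args) {
        IfBlock b = parseIf("if (a(\"x\")) { if (b) { c; } d; }\nelse { e; }");
        System.out.println("parameters: " + b.parameters);
        System.out.println("contents:   " + b.contents);
        System.out.println("rest:       " + b.rest);
    }
}
```

Because the scanner returns indices, the remainder falls out of a single substring() call, and the same findClose helper could be reused for other bracketed constructs by passing different delimiter pairs.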

Edit: Changed the returns to variable assignments, as the script would not have made sense otherwise.
