Removing unnecessary parentheses from a regular expression

Posted on 2024-09-04 04:24:39

Suppose I have (in a JavaScript regular expression)

((((A)B)C)D)

Of course, that really just reads

ABCD

Is there an algorithm to eliminate unnecessary parentheses in a string like that?

Comments (2)

云柯 2024-09-11 04:24:39

This function removes every group that is not followed by a quantifier and is not a look-around. It assumes ECMAScript-flavor regex syntax, and that capturing groups, ( ... ), are unimportant (removing them renumbers any remaining captures).

function removeUnnecessaryParenthesis(s) {
   // Tokenize the pattern. Because the split regex has one capturing group,
   // the separators (escapes, character classes, opening parentheses, and
   // closing parentheses with an optional quantifier) are kept in the result.
   var pieces = s.split(/(\\.|\[(?:\\.|[^\]\\])+]|\((?:\?[:!=])?|\)(?:[*?+]\??|\{\d+,?\d*}\??)?)/g);
   // Indexes of the opening parentheses that are still unmatched
   var stack = [];
   for (var i = 0; i < pieces.length; i++) {
      if (pieces[i].charAt(0) === "(") {
         // Opening parenthesis
         stack.push(i);
      } else if (pieces[i].charAt(0) === ")") {
         // Closing parenthesis
         if (stack.length === 0) {
            // Unbalanced closing parenthesis; leave it alone.
            continue;
         }
         var j = stack.pop();
         if ((pieces[j] === "(" || pieces[j] === "(?:") && pieces[i] === ")") {
             // It is a capturing or non-capturing group (not a look-around)
             // and is not followed by a quantifier:
             // clear both the opening and the closing piece.
             pieces[i] = "";
             pieces[j] = "";
         }
      }
   }
   return pieces.join("");
}

Examples:

removeUnnecessaryParenthesis("((((A)B)C)D)")  --> "ABCD"
removeUnnecessaryParenthesis("((((A)?B)C)D)") --> "(A)?BCD"
removeUnnecessaryParenthesis("((((A)B)?C)D)") --> "(AB)?CD"

It does not try to determine whether a quantified group contains only a single token, as in (A)?, so those parentheses are kept. Detecting that would require a longer tokenizing pattern.
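
One more case is worth checking before relying on the function above: its tokenizer does not treat | as a token, so an unquantified group such as (a|b)c is unwrapped to a|bc, which matches differently. The following is only a sketch of one way to guard against that, not part of the original answer. The name removeParensKeepingAlternation is made up for illustration; it uses the same tokenizer with | added, and conservatively keeps every group that has an alternation at its own nesting level.

```javascript
// Hypothetical variant of removeUnnecessaryParenthesis (name is an assumption).
// Same approach, but "|" is tokenized so that groups containing an
// alternation at their own nesting level are left untouched.
function removeParensKeepingAlternation(pattern) {
  // Same tokenizer as the original, with "\|" added as a separator.
  // Escapes and character classes are listed first, so "\|" and "[|]"
  // are never mistaken for an alternation.
  var pieces = pattern.split(
    /(\\.|\[(?:\\.|[^\]\\])+]|\||\((?:\?[:!=])?|\)(?:[*?+]\??|\{\d+,?\d*}\??)?)/
  );
  var stack = []; // one entry per open group: { index, hasAlternation }
  for (var i = 0; i < pieces.length; i++) {
    var piece = pieces[i];
    if (piece.charAt(0) === "(") {
      stack.push({ index: i, hasAlternation: false });
    } else if (piece === "|") {
      // Mark only the innermost open group; top-level "|" needs no marking.
      if (stack.length > 0) {
        stack[stack.length - 1].hasAlternation = true;
      }
    } else if (piece.charAt(0) === ")") {
      if (stack.length === 0) {
        continue; // unbalanced closing parenthesis; leave it alone
      }
      var open = stack.pop();
      var opener = pieces[open.index];
      var isPlainGroup = opener === "(" || opener === "(?:";
      // Remove only plain groups with no quantifier and no alternation.
      if (isPlainGroup && piece === ")" && !open.hasAlternation) {
        pieces[i] = "";
        pieces[open.index] = "";
      }
    }
  }
  return pieces.join("");
}
```

With this variant, "((((A)B)C)D)" still becomes "ABCD", while "(a|b)c" is returned unchanged and "((a|b))c" becomes "(a|b)c". It is deliberately conservative: a group that wraps the whole pattern, such as "(a|b)", is also kept even though it could be dropped.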

想你只要分分秒秒 2024-09-11 04:24:39

1) Use a parser that understands parentheses.

2) Use a Perl recursive regex that can match parentheses (discouraged in this case, IMHO). I don't think Boost regexes support the type of recursion needed.

3) Perhaps they are needed? Leave them alone.
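
To make option 1 concrete, here is a rough sketch of a small recursive-descent parser in JavaScript. It is an illustration under assumptions, not a full regex parser: the function name simplifyRegexSource is hypothetical, only the group forms (, (?:, (?=, (?!, (?<= and (?<! are recognized (no named groups), capturing groups are assumed to be unimportant, and groups containing an alternation are always kept.

```javascript
// Hypothetical helper: parse a regex source string into a small tree,
// then print it back with only the parentheses that are still needed.
function simplifyRegexSource(src) {
  // Tokens: escape, character class, group opener, ")", "|", quantifier, any char.
  var tokens = src.match(
    /\\.|\[(?:\\.|[^\]\\])*\]|\((?:\?(?:[:=!]|<[=!]))?|\)|\||[*+?]\??|\{\d+(?:,\d*)?\}\??|[\s\S]/g
  ) || [];
  var QUANT = /^(?:[*+?]\??|\{\d+(?:,\d*)?\}\??)$/;
  var ASSERTION = /^(?:[\^$]|\\[bB])$/; // must not receive a quantifier directly
  var pos = 0;

  function parseAlternation() { // alternation := sequence ("|" sequence)*
    var alternatives = [parseSequence()];
    while (tokens[pos] === "|") {
      pos++;
      alternatives.push(parseSequence());
    }
    return alternatives;
  }
  function parseSequence() { // sequence := term*
    var terms = [];
    while (pos < tokens.length && tokens[pos] !== "|" && tokens[pos] !== ")") {
      terms.push(parseTerm());
    }
    return terms;
  }
  function parseTerm() { // term := (atom | group) quantifier?
    var tok = tokens[pos++];
    var term;
    if (tok.charAt(0) === "(") {
      term = { open: tok, body: parseAlternation() };
      if (tokens[pos] !== ")") throw new SyntaxError("Unbalanced parentheses");
      pos++;
    } else {
      term = { text: tok };
    }
    term.quant = QUANT.test(tokens[pos] || "") ? tokens[pos++] : "";
    return term;
  }

  function isPlain(term) {
    return term.open === "(" || term.open === "(?:";
  }
  // True when the group body is exactly one unquantified token, as in (A).
  function bodyIsSingleToken(term) {
    return term.body.length === 1 && term.body[0].length === 1 &&
      isSingleToken(term.body[0][0]);
  }
  function isSingleToken(term) {
    if (term.quant) return false;
    if (!term.open) return !ASSERTION.test(term.text);
    return isPlain(term) && bodyIsSingleToken(term);
  }
  function renderAlternation(alternatives) {
    return alternatives.map(function (terms) {
      return terms.map(renderTerm).join("");
    }).join("|");
  }
  function renderTerm(term) {
    if (!term.open) return term.text + term.quant;
    var body = renderAlternation(term.body);
    // Drop a plain group only if it has no alternation and either has no
    // quantifier or wraps a single token the quantifier can attach to.
    var droppable = isPlain(term) && term.body.length === 1 &&
      (!term.quant || bodyIsSingleToken(term));
    return droppable ? body + term.quant : term.open + body + ")" + term.quant;
  }

  var tree = parseAlternation();
  if (pos < tokens.length) throw new SyntaxError("Unbalanced parentheses");
  return renderAlternation(tree);
}
```

Because the parser knows what each group contains, it can also unwrap a quantified group around a single token, so "((((A)?B)C)D)" comes out as "A?BCD" rather than "(A)?BCD", while "(?:a*)?" and "(a|b)c" are left as they are. Unbalanced input such as "(a" throws a SyntaxError instead of being passed through.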
