PHP Regexp (PCRE) - Find the set of all substrings 2

Posted on 2024-10-21 12:24:43

For example, take this source string:

__aaXXccYYeeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZvv..

How can I find all matches of aaXX*YY*ZZ?

__ aaXX cc YY eeXX_ ZZ kkYYmmXX_ZZnnXXooYYuuXX_ZZvv..

__ aaXX cc YY eeXX_ZZkkYYmmXX_ ZZ nnXXooYYuuXX_ZZvv..

__ aaXX cc YY eeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ ZZ vv..

__ aaXX ccYYeeXX_ZZkk YY mmXX_ ZZ nnXXooYYuuXX_ZZvv..

__ aaXX ccYYeeXX_ZZkk YY mmXX_ZZnnXXooYYuuXX_ ZZ vv..

__ aaXX ccYYeeXX_ZZkkYYmmXX_ZZnnXXoo YY uuXX_ ZZ vv..

The problem is that PHP's preg functions do not support the variable-length quantifiers ?, + and * inside a (?<=exp) lookbehind assertion; only a fixed length such as {N} is allowed.

So I need a solution that does not rely on a variable-length lookbehind assertion.

Thank you!

Comments (3)

有木有妳兜一样 2024-10-28 12:24:43

This script works:

<?php // test.php 20110311_1200
    $data = '__aaXXccYYeeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZvv..';
    $all_matches = array();
    $yy_match = true; // Get past first for test condition.
    for ($yy_cnt = 1; $yy_match; ++$yy_cnt) {
        $yy_match = false; // Assume failure for this yy_cnt.
        $zz_match = true; // Get past first for test condition.
        for ($zz_cnt = 1; $zz_match; ++$zz_cnt) {
            $zz_match = false; // Assume failure for this zz_cnt.
            // Assemble new regex with new $yy_cnt and $zz_cnt.
            $re = "/ # Match all combinations of XX..YY..ZZ.
                (aaXX)                   # $1: Prefix X.
                (?:                      # Group to find YY[yy_cnt].
                  (?:(?!YY).)*           # Zero or more non-YY.
                  (YY)                   # $2: next YY.
                ){{$yy_cnt}}             # yy_cnt.
                (?:                      # Group to find ZZ[zz_cnt].
                  (?:(?!ZZ).)*           # Zero or more non-ZZ.
                  (ZZ)                   # $3 next ZZ.
                ){{$zz_cnt}}             # $zz_cnt.
                /x";
            if (preg_match($re, $data, $matches, PREG_OFFSET_CAPTURE)) {
                $zz_match = true;
                $yy_match = true;
                $all_matches[] = $matches;
                printf("Match found. \$yy_cnt = %d, \$zz_cnt = %d\n",
                    $yy_cnt, $zz_cnt);
            }
        }
    }
    print_r($all_matches);
?>
宣告ˉ结束 2024-10-28 12:24:43

You need to loop. First look for __aaXX followed by the next YY, then __aaXX followed by the second YY etc. In regex land that means you first look for __aaXX(.*?YY){1}, then __aaXX(.*?YY){2} (can you see a loop variable in there?) and so on until the pattern fails. Same for the second part when you are looking for the ZZs.
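
The loop described above could be sketched roughly as follows. This is only an illustration of the idea, not the answerer's own code: the function name find_pairs_by_looping and the returned array shape are assumptions, and the leading underscores are left out of the prefix. It uses the lazy .*?YY form from this answer, with the loop counters interpolated into the {N} quantifiers.

```php
<?php
// Hypothetical helper (not from the original answer): try aaXX, the n-th YY,
// then the k-th ZZ after it, raising n and k until the pattern stops matching.
function find_pairs_by_looping(string $data): array
{
    $found = array();
    for ($y = 1; ; ++$y) {
        $hits = 0; // Matches found for this YY count.
        for ($z = 1; ; ++$z) {
            // "{{$y}}" interpolates to "{1}", "{2}", ... inside double quotes.
            $re = "/aaXX(?:.*?(YY)){{$y}}(?:.*?(ZZ)){{$z}}/";
            if (!preg_match($re, $data, $m, PREG_OFFSET_CAPTURE)) {
                break; // No further ZZ for this YY count.
            }
            ++$hits;
            // A capture group inside a repeated group keeps its last iteration,
            // so $m[1] is the n-th YY and $m[2] is the k-th ZZ after it.
            $found[] = array('yy' => $m[1][1], 'zz' => $m[2][1]);
        }
        if ($hits === 0) {
            break; // Not even one ZZ matched, so there are no more YYs to try.
        }
    }
    return $found;
}

$pairs = find_pairs_by_looping('__aaXXccYYeeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZvv..');
printf("%d combinations found\n", count($pairs));
```

Each entry holds the byte offsets of the YY and ZZ that were matched, which should be enough to rebuild the highlighted substrings from the question.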

情深缘浅 2024-10-28 12:24:43

How about this pattern: # aaXX(.*) YY (.*) ZZ .*#?

From your highlighting it's not entirely clear what your result should look like... I added spaces because you have them in the highlighting, but it's not clear if you'll have them in your source...

Edit

I guess I'm not understanding what you want to get, but another thing to look at is preg_match_all, if your YY ZZ part repeats... Something like #_aaXX((.*?)YY(.*?)ZZ)+#.
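
Since preg_match_all comes up here, one possible way to use it is sketched below. It is a different technique from the pattern suggested above: rather than one regex per combination, it collects the offsets of every YY and ZZ after the prefix in a single pass and pairs them up in plain PHP. The function name find_pairs_by_offsets and the returned array shape are assumptions made for illustration, and only the first aaXX in the string is considered.

```php
<?php
// Hypothetical helper (not from the original answer): pair every YY found
// after the first "aaXX" with every ZZ that follows it.
function find_pairs_by_offsets(string $data): array
{
    $start = strpos($data, 'aaXX');
    if ($start === false) {
        return array(); // No prefix, so nothing can match.
    }
    // One pass over the rest of the string; with PREG_OFFSET_CAPTURE each
    // entry of $m[0] is array(matched text, offset in $data).
    preg_match_all('/YY|ZZ/', $data, $m, PREG_OFFSET_CAPTURE, $start + 4);

    $pairs = array();
    $yy_seen = array(); // Offsets of the YYs encountered so far.
    foreach ($m[0] as $hit) {
        list($text, $offset) = $hit;
        if ($text === 'YY') {
            $yy_seen[] = $offset;
        } else {
            // This ZZ closes one combination for each earlier YY.
            foreach ($yy_seen as $yy) {
                $pairs[] = array('yy' => $yy, 'zz' => $offset);
            }
        }
    }
    return $pairs;
}

$pairs = find_pairs_by_offsets('__aaXXccYYeeXX_ZZkkYYmmXX_ZZnnXXooYYuuXX_ZZvv..');
printf("%d combinations found\n", count($pairs));
```

This avoids lookbehind entirely and needs only one regex call, at the cost of doing the pairing outside the regex. The pairs come out grouped by ZZ rather than by YY.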
