What is the regular expression to extract the second path segment of a URI?

Posted on 2024-10-02 05:27:09

I need to extract only the second path segment of a URI. For example, given the following URI:

/first/second/third/fourth/...

the regex should extract the string second from the URI. An explanation of the solution regex would be greatly appreciated.

I am using a POSIX-compliant regex library.

EDIT:
The solution given by Gumbo works in REtester, but it doesn't seem to work with the code below:

#include <stdio.h>
#include <stdlib.h>
#include "regex.h"

/* Returns a malloc'd copy of the matched text, or NULL if nothing matched. */
char *regexp(const char *string, const char *patrn, int *begin, int *end)
{
    int i, w = 0, len;
    char *word = NULL;
    regex_t rgT;
    regmatch_t match;

    wsregcomp(&rgT, patrn, REG_EXTENDED);
    if (wsregexec(&rgT, string, 1, &match, 0) == 0) {
        *begin = (int)match.rm_so;
        *end = (int)match.rm_eo;
        len = *end - *begin;
        word = (char *)malloc(len + 1);
        for (i = *begin; i < *end; i++) {
            word[w] = string[i];
            w++;
        }
        word[w] = 0;
    }
    wsregfree(&rgT);
    return word;
}

int main()
{
    int begin = 0;
    int end = 0;

    char *word = regexp("/first/second/third", "^/[^/]+/([^/]*)", &begin, &end);
    printf("ENV %s\n", word);
}

The above prints /first/second instead of just second.

EDIT2:
Same result with java.util.regex as well.


忆离笙 2024-10-09 05:27:09

If you only have an absolute URI path, this regular expression should do it:

^/[^/]+/([^/]*)

An explanation:

^/ matches the start of the string followed by a literal /
[^/]+/ matches one or more characters except /, followed by a literal /
([^/]*) matches zero or more characters except /.

The second path segment is then captured by the first group. I used + for the first segment and * for the second because, if the first segment could also be empty, the input would no longer be an absolute path but a scheme-less URI (one beginning with //).
