What is the regex to extract the second path segment of a URI?
I need to extract only the second path segment of a URI, i.e. given the following URI:
/first/second/third/fourth/...
the regex should extract the string second from the URI. An explanation of the solution regex would be greatly appreciated.
I am using a POSIX-compliant regex library.
EDIT: The solution given by Gumbo works on REtester, but it doesn't seem to work with the code below:
#include <stdio.h>
#include <stdlib.h>
#include "regex.h"

char *regexp(const char *string, const char *patrn, int *begin, int *end) {
    int i, w = 0, len;
    char *word = NULL;
    regex_t rgT;
    regmatch_t match;

    wsregcomp(&rgT, patrn, REG_EXTENDED);
    if ((wsregexec(&rgT, string, 1, &match, 0)) == 0) {
        *begin = (int)match.rm_so;
        *end = (int)match.rm_eo;
        len = *end - *begin;
        /* copy the matched span into a freshly allocated buffer */
        word = (char *)malloc(len + 1);
        for (i = *begin; i < *end; i++) {
            word[w] = string[i];
            w++;
        }
        word[w] = 0;
    }
    wsregfree(&rgT);
    return word;
}

int main() {
    int begin = 0;
    int end = 0;
    char *word = regexp("/first/second/third", "^/[^/]+/([^/]*)", &begin, &end);
    printf("ENV %s\n", word);
}
The above prints /first/second instead of only second.
EDIT2: Same result with java.util.regex as well.
If you just have an absolute URI path, then this regular expression should do it:
^/[^/]+/([^/]*)
An explanation:
^/ matches the start of the string followed by a literal /
[^/]+/ matches one or more characters except /, followed by a literal /
([^/]*) matches zero or more characters except /
The second path segment is then matched by the first group. I used + for the first and * for the second because if the first also allowed a zero length, it wouldn't be an absolute path any more but a scheme-less URI.
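Regarding the EDIT in the question: the code there asks the library for a single regmatch_t slot and reads that one, which in POSIX regexec is the whole match, so /first/second is what I would expect it to print. The first group sits at index 1 of the match array. Below is a minimal sketch of reading that group. It assumes the standard POSIX regcomp/regexec/regfree rather than the wsreg* wrappers from the question (I can't confirm how those behave), and second_segment is a hypothetical helper name used only for illustration.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns a malloc'd
 * copy of capture group 1, or NULL if the pattern fails to compile,
 * the input does not match, or allocation fails. */
char *second_segment(const char *path)
{
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = first group */
    char *out = NULL;

    if (regcomp(&re, "^/[^/]+/([^/]*)", REG_EXTENDED) != 0)
        return NULL;

    /* Ask for 2 slots so the group offsets are filled in as well. */
    if (regexec(&re, path, 2, m, 0) == 0 && m[1].rm_so != -1) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        out = malloc(len + 1);
        if (out != NULL) {
            memcpy(out, path + m[1].rm_so, len);
            out[len] = '\0';
        }
    }
    regfree(&re);
    return out;
}
```

Calling second_segment("/first/second/third") should give back second; the caller is responsible for freeing the result. The same idea applies to java.util.regex: group(0) is the whole match and group(1) is the first group.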