Regular expression - negative lookahead to exclude strings

Posted 2024-11-15 14:15:36 · 502 characters · 2 views · 0 comments

I am trying to find (and replace with something else) all parts of a text which

  1. start with '/'
  2. end with '/'
  3. contain anything between the two '/'s, except the strings '.' and '..'.

(For your info, I am searching for and replacing directory and file names, hence the '.' and '..' should be excluded.)

This is the regular expression I came up with:

/(?!\.|\.\.)([^/]+)/

The second part

([^/]+)

matches every sequence of characters, '/' excluded. There are no character restrictions required, I am simply interpreting the input.

The first part

(?!\.|\.\.)

uses the negative lookahead assertion to exclude the strings '.' and '..'.

However, this doesn't seem to work in PHP with mb_ereg_replace().

Can somebody help me out? I fail to see what's wrong with my regex.

Thank you.

Comments (4)

萌吟 2024-11-22 14:15:36

POSIX regexes probably don't support negative lookaheads (I may be wrong, though).

Anyway, since PCRE regexes are usually faster than POSIX ones, I think you can use the PCRE version of the same function; PCRE supports UTF-8 as well, via the u flag.

Consider this code as a substitute:

preg_replace('~/(?!\.|\.\.)([^/]+)/~u', "", $str);

EDIT: Even better is to use:

preg_replace('~/(?!\.)([^/]+)/~u', "", $str);
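
A quick way to see how the first pattern behaves is to run it over a few strings. This is only a sketch: the sample inputs are made up for illustration. One thing it shows is that the lookahead is not anchored, so any name beginning with a dot is skipped, not only '.' and '..'.

```php
<?php
// First pattern from the answer above; the sample inputs are made up for illustration.
$pattern = '~/(?!\.|\.\.)([^/]+)/~u';

$samples = array('x/héllo/y', 'x/./y', 'x/../y', 'x/.hidden/y');

$results = array();
foreach ($samples as $str) {
    // A matching segment (including both slashes) is replaced with an empty string.
    $results[$str] = preg_replace($pattern, "", $str);
}

foreach ($results as $in => $out) {
    echo $in, ' => ', $out, "\n";
}
// x/héllo/y => xy
// x/./y => x/./y
// x/../y => x/../y
// x/.hidden/y => x/.hidden/y
```

If names such as '.hidden' should be replaced too, the lookahead has to check what follows the dots as well.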

千と千尋 2024-11-22 14:15:36

This is a little verbose, but it definitely does work:

#/((\.[^./][^/]*)|(\.\.[^/]+)|([^.][^/]*))/#
^  |------------| |---------| |---------|
|        |             |               |
|        |        text starting with   |
|        |        two dots, that isn't |
|        |             "." or ".."     |
|  text starting with                  |
|  a dot, that isn't                text not starting
|  "." or ".."                         with a dot
|
delimiter

Does not match:

  • hi
  • //
  • /./
  • /../

Does match:

  • /hi/
  • /.hi/
  • /..hi/
  • /.../

Have a play around with it on http://regexpal.com/.

I wasn't sure whether or not you wanted to allow //. If you do, stick * before the last /.
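
For anyone who prefers to check the lists above from PHP rather than in the browser, here is a small sketch using preg_match. The pattern is copied unchanged; the loop and the $matched array are just illustrative scaffolding.

```php
<?php
// Pattern copied from the answer; '#' is the delimiter.
$pattern = '#/((\.[^./][^/]*)|(\.\.[^/]+)|([^.][^/]*))/#';

// The eight strings from the "does not match" / "does match" lists.
$cases = array('hi', '//', '/./', '/../', '/hi/', '/.hi/', '/..hi/', '/.../');

$matched = array();
foreach ($cases as $case) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    $matched[$case] = preg_match($pattern, $case) === 1;
    echo str_pad($case, 8), $matched[$case] ? 'match' : 'no match', "\n";
}
```

The first four strings should report "no match" and the last four "match".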

夏尔 2024-11-22 14:15:36

I'm not against regex, but I would have done this instead:

function simplify_path($path, $directory_separator = "/", $equivalent = true){
  $path = trim($path);
  // if it's absolute, it stays absolute:
  $prepend = (substr($path,0,1) == $directory_separator)?$directory_separator:"";
  $path_array = explode($directory_separator, $path);
  if($prepend) array_shift($path_array);
  $output = array();
  foreach($path_array as $val){
    // a '..' is kept only when there is nothing left to pop, i.e. the output
    // is empty or its last kept segment is itself '..' (and $equivalent is set)
    if($val != '..' || ((empty($output) || end($output) == '..') && $equivalent)) {
      // skip empty segments (from '//') and '.'
      if($val != '' && $val != '.'){
        array_push($output, $val);
      }
    } elseif(!empty($output)) {
      // '..' cancels the previous segment
      array_pop($output);
    }
  }
  return $prepend.implode($directory_separator,$output);
}

Tests:

echo(simplify_path("../../../one/no/no/../../two/no/../three"));
// =>  ../../../one/two/three
echo(simplify_path("/../../one/no/no/../../two/no/../three"));
// =>  /../../one/two/three
echo(simplify_path("/one/no/no/../../two/no/../three"));
// =>  /one/two/three
echo(simplify_path(".././../../one/././no/./no/../../two/no/../three"));
// =>  ../../../one/two/three
echo(simplify_path(".././..///../one/.///./no/./no/../../two/no/../three/"));
// =>  ../../../one/two/three

I thought that it would be better to return an equivalent string, so I respected the occurrences of .. at the beginning of the string.

If you don't want them, you can call it with the third parameter $equivalent = false:

echo(simplify_path("../../../one/no/no/../../two/no/../three", "/", false));
// =>  one/two/three
echo(simplify_path("/../../one/no/no/../../two/no/../three", "/", false));
// =>  /one/two/three
echo(simplify_path("/one/no/no/../../two/no/../three", "/", false));
// =>  /one/two/three
echo(simplify_path(".././../../one/././no/./no/../../two/no/../three", "/", false));
// =>  one/two/three
echo(simplify_path(".././..///../one/.///./no/./no/../../two/no/../three/", "/", false));
// =>  one/two/three

但可醉心 2024-11-22 14:15:36

/(?!(\.|\.\.)/)([^/]+)/
This will allow ... as a valid name.
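
A rough sketch of how this could be tried with preg_replace: the '~' delimiters and the u flag are borrowed from the earlier answer, and mark_names is a hypothetical helper that wraps each matched name in brackets so the effect is visible. Because the lookahead now also requires the closing '/', only the exact names '.' and '..' are skipped.

```php
<?php
// Pattern from this answer, wrapped in '~' delimiters with the u flag
// (an assumption, following the preg_replace suggestion earlier in the thread).
$pattern = '~/(?!(\.|\.\.)/)([^/]+)/~u';

// Hypothetical helper: wraps every matched name in square brackets.
function mark_names($pattern, $str) {
    // Group 1 sits inside the lookahead, so the name itself is group 2.
    return preg_replace($pattern, '/[$2]/', $str);
}

foreach (array('/./', '/../', '/.../', '/.hi/', '/hi/') as $s) {
    echo $s, ' => ', mark_names($pattern, $s), "\n";
}
// /./ => /./
// /../ => /../
// /.../ => /[...]/
// /.hi/ => /[.hi]/
// /hi/ => /[hi]/
```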
