Regular expression - negative lookahead to exclude strings

Posted on 2024-11-15 14:15:36

I am trying to find (and replace with something else) all parts of a text which

start with '/',
end with '/', and
contain anything between the two slashes except the strings '.' and '..'.

(For your info, I am searching for and replacing directory and file names, hence the '.' and '..' should be excluded.)

This is the regular expression I came up with:

/(?!\.|\.\.)([^/]+)/

The second part

([^/]+)

matches any sequence of characters other than '/'. No further character restrictions are needed; I am simply interpreting the input.

The first part

(?!\.|\.\.)

uses the negative lookahead assertion to exclude the strings '.' and '..'.

However, this doesn't seem to work in PHP with mb_ereg_replace().

Can somebody help me out? I fail to see what's wrong with my regex.

Thank you.

萌吟 2024-11-22 14:15:36

POSIX regular expressions probably don't support negative lookaheads. (I may be wrong, though.)

Anyway, since PCRE is usually faster than POSIX regex, I think you can use the PCRE version of the same function; PCRE also supports UTF-8 via the u flag.

Consider this code as a substitute:

preg_replace('~/(?!\.|\.\.)([^/]+)/~u', "", $str);

EDIT: Even simpler, since anything that starts with '..' also starts with '.':

preg_replace('~/(?!\.)([^/]+)/~u', "", $str);
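
One caveat, offered as a sketch rather than a definitive fix: both lookaheads above reject every name that starts with a dot, so a component such as /.hi/ is skipped too. Assuming the goal is to exclude only the exact names '.' and '..', the lookahead can include the closing slash. The helper name replace_components and the replacement text 'X' are hypothetical, used only for illustration.

```php
<?php
// Hypothetical helper: replaces a /name/ component unless the name is
// exactly "." or "..". The lookahead (?!\.{1,2}/) includes the closing
// slash, so it rejects only those two names, not every dot-prefixed name.
function replace_components(string $str, string $replacement): string {
    // preg_replace returns null on error; fall back to the input then.
    return preg_replace('~/(?!\.{1,2}/)([^/]+)/~u', $replacement, $str) ?? $str;
}

// Each input is a single component so the results are easy to compare.
foreach (['/./', '/../', '/.hi/', '/..hi/', '/hi/'] as $input) {
    echo $input, ' => ', replace_components($input, 'X'), "\n";
}
```

With these inputs, /./ and /../ come back unchanged, while /.hi/, /..hi/ and /hi/ are replaced.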

千と千尋 2024-11-22 14:15:36

This is a little verbose, but it definitely does work:

#/((\.[^./][^/]*)|(\.\.[^/]+)|([^.][^/]*))/#
^  |------------| |---------| |---------|
|        |             |               |
|        |        text starting with   |
|        |        two dots, that isn't |
|        |             "." or ".."     |
|  text starting with                  |
|  a dot, that isn't                text not starting
|  "." or ".."                         with a dot
|
delimiter

Does not match:

hi
//
/./
/../

Does match:

/hi/
/.hi/
/..hi/
/.../

Have a play around with it on http://regexpal.com/.

I wasn't sure whether or not you wanted to allow //. If you do, stick * before the last /.
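
The lists above can also be checked from PHP instead of a web tool. This is a minimal sketch, assuming preg_match is acceptable for the check; the variable names $cases and $results are introduced here only for illustration.

```php
<?php
// The pattern from this answer, unchanged ('#' is the delimiter).
$pattern = '#/((\.[^./][^/]*)|(\.\.[^/]+)|([^.][^/]*))/#';

// Inputs listed above, mapped to whether each one should match.
$cases = [
    'hi' => false, '//' => false, '/./' => false, '/../' => false,
    '/hi/' => true, '/.hi/' => true, '/..hi/' => true, '/.../' => true,
];

$results = [];
foreach ($cases as $input => $expected) {
    // preg_match returns 1 when the pattern matches somewhere in $input.
    $results[$input] = preg_match($pattern, $input) === 1;
    printf("%-8s %s\n", $input, $results[$input] ? 'match' : 'no match');
}
```

One thing to keep in mind: [^.] in the last alternative also accepts a slash, so an input such as /// does match. Using [^./] there would tighten that, if empty components matter.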

夏尔 2024-11-22 14:15:36

I'm not against regex, but I would have done this instead:

function simplify_path($path, $directory_separator = "/", $equivalent = true){
  $path = trim($path);
  // if it's absolute, it stays absolute:
  $prepend = (substr($path,0,1) == $directory_separator)?$directory_separator:"";
  $path_array = explode($directory_separator, $path);
  if($prepend) array_shift($path_array);
  $output = array();
  foreach($path_array as $val){
    // keep a '..' only while the output is still a leading run of '..';
    // end($output) is checked instead of a cached value, which would go
    // stale after array_pop (e.g. "../a/../.." must give "../..")
    if($val != '..' || ((empty($output) || end($output) == '..') && $equivalent)) {
      // skip empty components (from '//') and '.'
      if($val != '' && $val != '.'){
        array_push($output, $val);
      }
    } elseif(!empty($output)) {
        // '..' cancels the previous component
        array_pop($output);
    }
  }
  return $prepend.implode($directory_separator,$output);
}

Tests:

echo(simplify_path("../../../one/no/no/../../two/no/../three"));
// =>  ../../../one/two/three
echo(simplify_path("/../../one/no/no/../../two/no/../three"));
// =>  /../../one/two/three
echo(simplify_path("/one/no/no/../../two/no/../three"));
// =>  /one/two/three
echo(simplify_path(".././../../one/././no/./no/../../two/no/../three"));
// =>  ../../../one/two/three
echo(simplify_path(".././..///../one/.///./no/./no/../../two/no/../three/"));
// =>  ../../../one/two/three

I thought it would be better to return an equivalent string, so I kept the occurrences of .. at the beginning of the string.

If you don't want them, you can call it with the third parameter $equivalent = false:

echo(simplify_path("../../../one/no/no/../../two/no/../three", "/", false));
// =>  one/two/three
echo(simplify_path("/../../one/no/no/../../two/no/../three", "/", false));
// =>  /one/two/three
echo(simplify_path("/one/no/no/../../two/no/../three", "/", false));
// =>  /one/two/three
echo(simplify_path(".././../../one/././no/./no/../../two/no/../three", "/", false));
// =>  one/two/three
echo(simplify_path(".././..///../one/.///./no/./no/../../two/no/../three/", "/", false));
// =>  one/two/three
