How do I exclude a string from preg_split in PHP when it contains one of the split characters?

Posted on 2024-11-28 06:20:09

I am using the preg_split function in PHP in order to create one array containing several different elements. However, I want to exclude a string which happens to contain one of the elements that I'm preg_splitting by.

$array['stuff'] = preg_split('/\[#]|\ &  |\ & |\& |\&|\ &|\ >  |\ > |\> |\>|\ >|\ &  |\ & |\& |\&|\ &|\ \/  |\ \/ |\\/ |\\/|\ \/|\ >  |\ > |\> |\>|\ >|\ ,  |\ , |\, |\,|\, |\ ::  |\ :: |\:: |\ ::|\::|\ ::|\ :  |\ : |\: |\:|\ :|\ -  |\ - |\- |\-|\ -/', $array['stuff'] ) ;

What I would like to do is to exclude a string such as 'foo-bar' from being matched for a split because it contains a dash. 'foo-bar' would need to be an exact match for my purposes.

Comments (2)

春花秋月 2024-12-05 06:20:09

The resulting regular expression would be very complicated, especially if you have a lot of exceptions like 'foo-bar'.

You should use a conditional subpattern with a lookbehind as condition and a lookahead as its yes-pattern:

$res = preg_split('/(?(?<=foo)\-(?!bar)|\-)/', 'aasdf-fafsdf-foo-bar-asdf' );
var_dump( $res );

result:

array(4) {
  [0]=>
  string(5) "aasdf"
  [1]=>
  string(6) "fafsdf"
  [2]=>
  string(7) "foo-bar"
  [3]=>
  string(4) "asdf"
}

Let me explain what is happening here. \- means

Match any dash character.

but what we want is

Match any dash character that is not part of foo-bar.

Since we can't express that directly in a regex, we rephrase it a little:

Match any dash character that, if preceded by foo, is not followed by bar.

To implement the if part we use a conditional subpattern, this is the syntax:

(?(condition)yes-pattern|no-pattern)

Our "condition" is "preceded by foo". To check for that, we use a lookbehind:

(?<=foo)

If that is true, we should look for "a dash that is not followed by bar". To do that, we use a negative lookahead:

\-(?!bar)

And that becomes our "yes-pattern". Our "no-pattern" should be \- or "any dash". The complete regex would be:

(?(?<=foo)\-(?!bar)|\-)
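To see what the conditional subpattern does at the boundaries, here is a minimal sketch that runs it against three short inputs. The function name `splitKeepingFooBar` is a hypothetical helper introduced only for illustration; the pattern is the one above, with the unnecessary backslashes before the dashes dropped.

```php
<?php
// Hypothetical helper: split on every dash except the one inside "foo-bar".
function splitKeepingFooBar(string $input): array
{
    // Condition: is the current position preceded by "foo"?
    //   yes -> match a dash only if "bar" does not follow
    //   no  -> match any dash
    return preg_split('/(?(?<=foo)-(?!bar)|-)/', $input);
}

var_export(splitKeepingFooBar('foo-bar')); // ['foo-bar']    (protected)
var_export(splitKeepingFooBar('foo-baz')); // ['foo', 'baz'] (not followed by bar)
var_export(splitKeepingFooBar('x-bar'));   // ['x', 'bar']   (not preceded by foo)
```

Only the dash that has foo immediately before it and bar immediately after it is left alone. Note that the lookarounds are not anchored to word boundaries, so a string such as 'snafoo-barn' would be protected as well.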

UPDATE: to incorporate this into your current regex, change this part at the end:

|\ -  |\ - |\- |\-|\ -/

to

|\s?(?(?<=foo)\-(?!bar)|\-)\s?/

紧拥背影 2024-12-05 06:20:09

Though I make no guarantee that my solution is more efficient than nobody's double lookaround pattern for this case, I think my solution is slightly easier to read. (*SKIP)(*FAIL) effectively matches and discards the substrings that you wish to ignore. In some cases, this approach can be very useful/effective/maintainable.

Code:

$string = 'I-like-candy-and-foo-bar-sandwiches';
var_export(preg_split('~foo-bar(*SKIP)(*FAIL)|-~', $string));

Output:

array (
  0 => 'I',
  1 => 'like',
  2 => 'candy',
  3 => 'and',
  4 => 'foo-bar',
  5 => 'sandwiches',
)
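If there are several exceptions rather than just 'foo-bar', the same (*SKIP)(*FAIL) idea can be generalized by building the pattern from a list. The sketch below is one possible way to do that, not part of the original answer; `splitExcept` and its `$protected` parameter are hypothetical names, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: split $input on dashes, but never inside any
// of the substrings listed in $protected.
function splitExcept(string $input, array $protected): array
{
    // With nothing to protect, an empty group followed by (*SKIP)(*FAIL)
    // would block every match, so fall back to a plain dash.
    if ($protected === []) {
        return preg_split('~-~', $input);
    }

    // Escape each protected string so it is matched literally.
    $quoted = array_map(
        fn (string $s): string => preg_quote($s, '~'),
        $protected
    );

    // Protected substrings are matched first and then discarded;
    // any dash that is left over becomes a split point.
    $pattern = '~(?:' . implode('|', $quoted) . ')(*SKIP)(*FAIL)|-~';

    return preg_split($pattern, $input);
}

var_export(splitExcept('I-like-foo-bar-and-baz-qux-pie', ['foo-bar', 'baz-qux']));
// ['I', 'like', 'foo-bar', 'and', 'baz-qux', 'pie']
```

Adding another exception then means adding one string to the list rather than another pair of lookarounds.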

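To be completely honest, I think nobody's answer is a bit over-engineered for this input. It can be written more simply as a negative lookbehind plus a negative lookahead, with no conditional syntax. Be aware that the two patterns are not strictly equivalent: this one also refuses to split on any dash that is merely preceded by foo or merely followed by bar (foo-baz and x-bar stay whole), whereas the conditional pattern protects only the dash in foo-bar.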

Code:

$string = 'I-like-candy-and-foo-bar-sandwiches';
var_export(preg_split('~(?<!foo)-(?!bar)~', $string));

Output:

array (
  0 => 'I',
  1 => 'like',
  2 => 'candy',
  3 => 'and',
  4 => 'foo-bar',
  5 => 'sandwiches',
)
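Before choosing between the two lookaround styles, it can help to compare them on inputs other than the sample sentence. The following sketch counts how many parts each pattern produces; `countParts` is a hypothetical helper used only for this comparison.

```php
<?php
// Hypothetical helper: number of pieces preg_split() yields for a pattern.
function countParts(string $pattern, string $input): int
{
    return count(preg_split($pattern, $input));
}

$conditional = '~(?(?<=foo)-(?!bar)|-)~'; // conditional subpattern
$lookarounds = '~(?<!foo)-(?!bar)~';      // negative lookbehind + lookahead

foreach (['foo-bar', 'foo-baz', 'x-bar'] as $input) {
    printf(
        "%-8s conditional: %d, lookarounds: %d\n",
        $input,
        countParts($conditional, $input),
        countParts($lookarounds, $input)
    );
}
// foo-bar  conditional: 1, lookarounds: 1
// foo-baz  conditional: 2, lookarounds: 1
// x-bar    conditional: 2, lookarounds: 1
```

If strings such as foo-baz or x-bar can appear in the data and must be split, the conditional or (*SKIP)(*FAIL) pattern is the safer choice.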

p.s. If you might have a hyphen at the start or end of your input string AND you don't want empty elements to be generated by preg_split(), then use 0 and PREG_SPLIT_NO_EMPTY as parameters 3 and 4 (respectively) in the function call.
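As a short illustration of that last point, the sketch below splits a string with a leading and a trailing hyphen, first with the default arguments and then with a limit of 0 (no limit) and PREG_SPLIT_NO_EMPTY. The input string is made up for the example.

```php
<?php
$string  = '-I-like-foo-bar-';
$pattern = '~foo-bar(*SKIP)(*FAIL)|-~';

// Default call: the leading and trailing hyphens produce empty strings.
$withEmpty = preg_split($pattern, $string);

// Limit 0 means "no limit"; the flag drops empty pieces from the result.
$noEmpty = preg_split($pattern, $string, 0, PREG_SPLIT_NO_EMPTY);

var_export($withEmpty); // ['', 'I', 'like', 'foo-bar', '']
var_export($noEmpty);   // ['I', 'like', 'foo-bar']
```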
