How do I exclude a string from preg_split in PHP when the string already contains one of the split characters?
I am using the preg_split function in PHP in order to create one array containing several different elements. However, I want to exclude a string which happens to contain one of the elements that I'm preg_splitting by.
$array['stuff'] = preg_split('/\[#]|\ & |\ & |\& |\&|\ &|\ > |\ > |\> |\>|\ >|\ & |\ & |\& |\&|\ &|\ \/ |\ \/ |\\/ |\\/|\ \/|\ > |\ > |\> |\>|\ >|\ , |\ , |\, |\,|\, |\ :: |\ :: |\:: |\ ::|\::|\ ::|\ : |\ : |\: |\:|\ :|\ - |\ - |\- |\-|\ -/', $array['stuff'] ) ;
What I would like to do is to exclude a string such as 'foo-bar' from being matched for a split because it contains a dash. 'foo-bar' would need to be an exact match for my purposes.
Answers (2)
The resulting regular expression would be very complicated, especially if you have a lot of exceptions like 'foo-bar'.
You can use a conditional subpattern with a lookbehind as the condition and a lookahead as its yes-pattern.
Let me explain what is happening here.
\- means "split on any dash", but what we want is "split on a dash unless it is the dash inside 'foo-bar'".
Since we can't express that directly in a regex, we rephrase it a little: "if the dash is preceded by foo, split only when it is not followed by bar; otherwise split on any dash".
To implement the "if" part we use a conditional subpattern, which has this syntax:
(?(condition)yes-pattern|no-pattern)
Our condition is "preceded by foo", and to check for that we use a lookbehind:
(?<=foo)
If that is true, we should look for "a dash that is not followed by bar", and for that we use a negative lookahead:
\-(?!bar)
That becomes our yes-pattern. Our no-pattern should be \- or "any dash". The complete regex would be:
(?(?<=foo)\-(?!bar)|\-)
UPDATE: to incorporate this into your current regex, replace the plain \- alternative near the end of your pattern with the conditional subpattern shown on the previous line.
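A minimal sketch of the conditional-subpattern approach in PHP. The sample inputs are hypothetical and chosen only to illustrate the behaviour; they do not come from the original answer.

```php
<?php
// Conditional subpattern: if the dash is preceded by "foo", only split
// when it is NOT followed by "bar"; otherwise split on any dash.
$pattern = '/(?(?<=foo)\-(?!bar)|\-)/';

// Hypothetical sample inputs, for illustration only.
$protected = preg_split($pattern, 'alpha-foo-bar-beta');
$ordinary  = preg_split($pattern, 'foo-baz');

print_r($protected); // alpha, foo-bar, beta
print_r($ordinary);  // foo, baz
```

The dash in 'foo-baz' is still a delimiter because the negative lookahead only rejects a dash that is followed by 'bar'.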
Though I make no guarantee that my solution is more efficient than nobody's double-lookaround pattern for this case, I think my solution is slightly easier to read.
(*SKIP)(*FAIL) effectively matches and discards the substrings that you wish to ignore. In some cases, this approach can be very useful, effective, and maintainable.
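A minimal sketch of the (*SKIP)(*FAIL) technique, assuming 'foo-bar' is the only protected substring. The exact pattern and the sample input are illustrative assumptions, not the original answer's code.

```php
<?php
// First alternative: consume the protected substring, then force a failure
// with (*SKIP)(*FAIL) so the engine resumes scanning after "foo-bar".
// Second alternative: any remaining dash is a delimiter.
$pattern = '~foo-bar(*SKIP)(*FAIL)|-~';

// Hypothetical sample input, for illustration only.
$parts = preg_split($pattern, 'alpha-foo-bar-beta');

print_r($parts); // alpha, foo-bar, beta
```

More protected substrings can be listed by extending the first alternative, for example with a group such as (?:foo-bar|baz-qux), which is what keeps this form easy to maintain.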
To be completely honest, I think nobody's answer is a bit over-engineered. It can be written more simply as a negative lookbehind and a negative lookahead; there is no reason for the conditional syntax.
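One way to write the lookaround-only version is sketched below. The pattern is a reconstruction under the assumption that a dash should be kept only when it has 'foo' before it and 'bar' after it; the sample inputs are hypothetical.

```php
<?php
// Split on a dash that is not preceded by "foo",
// or on a dash that is not followed by "bar".
// Only a dash with "foo" before AND "bar" after survives.
$pattern = '~(?<!foo)-|-(?!bar)~';

// Hypothetical sample inputs, for illustration only.
$protected = preg_split($pattern, 'alpha-foo-bar-beta');
$leftOnly  = preg_split($pattern, 'foo-baz');
$rightOnly = preg_split($pattern, 'qux-bar');

print_r($protected); // alpha, foo-bar, beta
print_r($leftOnly);  // foo, baz
print_r($rightOnly); // qux, bar
```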
P.S. If you might have a hyphen at the start or end of your input string and you don't want preg_split() to generate empty elements, pass 0 and PREG_SPLIT_NO_EMPTY as parameters 3 and 4 (respectively) in the function call.
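The effect of those two extra arguments can be seen in a short sketch. The pattern and the input with leading and trailing hyphens are assumptions made for illustration.

```php
<?php
// Illustrative pattern: protect "foo-bar", split on every other dash.
$pattern = '~foo-bar(*SKIP)(*FAIL)|-~';

// Hypothetical input with a hyphen at both ends.
$input = '-alpha-foo-bar-beta-';

// Default call: empty strings appear at the start and end of the result.
$withEmpty = preg_split($pattern, $input);

// A limit of 0 means "no limit"; PREG_SPLIT_NO_EMPTY drops empty pieces.
$noEmpty = preg_split($pattern, $input, 0, PREG_SPLIT_NO_EMPTY);

print_r($withEmpty); // '', alpha, foo-bar, beta, ''
print_r($noEmpty);   // alpha, foo-bar, beta
```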