PHP preg regex: whitespace group without line breaks in multiline mode
Hello, I am trying to split some input by line and apply trim() to each line, but I would like to do it with a regex alone, without calling trim().
The issue is that whitespace at the end of each line is not trimmed away. I guess my group [^$\s], which I meant as "whitespace but no line break", does not work.
So the question is how to solve my problem, and how to define a group in a preg regex that explicitly ignores line breaks. At the moment I think my approach is still wrong. If I write \s* instead of this weird group, .+ eats everything. If I write .+? I do not get back complete strings when they include spaces.
preg_match_all("/^\s*+(.+)[^$\s]*+$/m", $_POST['input'], $matches, PREG_SET_ORDER );
Comments (2)
Okay, I'm usually all for using regular expressions, but the trim approach would be simpler here. I assume you avoided it because it usually requires an extra loop, but in this instance you can compact it into a single expression.
As an alternative to the solution you found, you could also use a pattern that is a bit hackish: it swaps out ^ and $ for \n and uses assertions to exclude newlines elsewhere. \p{Z} is a nice alternative for catching all Unicode space character variations, including NBSP and other ninja placeholders.
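The code from this answer did not survive in this copy, so the following is only a sketch of the two ideas it describes, not the answerer's original snippets. The function names are hypothetical. The first variant compacts the trim approach with array_map; the second keeps ^ with the /m modifier (rather than the \n-and-assertions form the answer mentions) and uses \p{Z} plus tab as a "whitespace but not newline" class. Skipping blank lines is an assumption about the desired result.

```php
<?php
// Hypothetical helper names; neither function is taken from the original answer.

// Variant 1: no regex trimming. Split on common line endings and let
// array_map apply trim() to every line without an explicit loop.
function trimLinesWithTrim(string $input): array
{
    $lines = preg_split('/\r\n|\r|\n/', $input);
    $trimmed = array_map('trim', $lines);
    // Assumption: blank and whitespace-only lines are not wanted.
    $nonEmpty = array_filter($trimmed, function ($line) {
        return $line !== '';
    });
    return array_values($nonEmpty);
}

// Variant 2: regex only, one match per non-blank line.
// [\p{Z}\t]   : Unicode separators (NBSP included) plus tab, which \p{Z} lacks
// [^\p{Z}\s]  : a character that is neither a separator nor ASCII whitespace
// The capture starts and ends on such a character, so both ends are trimmed.
function trimLinesUnicode(string $input): array
{
    $pattern = '/^[\p{Z}\t]*+([^\p{Z}\s](?:.*[^\p{Z}\s])?)/mu';
    preg_match_all($pattern, $input, $matches);
    // With /u, invalid UTF-8 input makes the call fail; fall back to [].
    return $matches[1] ?? [];
}
```

Note that \p{Z} on its own does not cover the tab character, which is why the class adds \t explicitly.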
You need something to eat the leading whitespace before your capture group, including whole blank lines. \s* does that. You don't need to force it to start at the beginning of a line, since you're not saving it anyway; its only purpose is to match up to just before a non-whitespace character.
So now you know that you're looking at a non-whitespace character and need to capture up to the last non-whitespace character on the same line. Since . won't match a newline, .*\S does just that.
One difference from your version is that the initial \s* of the next match gets to eat the trailing whitespace on the line you just matched. Since we no longer care about line endings, the /m modifier is no longer necessary.
You could make the first star possessive (\s*+); that won't change what it matches, but it will make it fail marginally faster at the end of the file if there's a long empty tail.
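Putting the pieces described above together gives roughly the following. This is a sketch assembled from the fragments \s*+ and .*\S named in the answer; the full pattern line is missing from this copy, and the function name and sample input are hypothetical.

```php
<?php
// Hypothetical helper name; the pattern is assembled from the pieces
// the answer describes, not copied from it.
function trimmedLines(string $input): array
{
    // \s*+   : possessively skip leading whitespace, newlines and blank lines included
    // (.*\S) : capture up to the last non-whitespace character on the same line,
    //          since . does not match a newline by default
    // No /m modifier: the pattern never anchors on line boundaries.
    preg_match_all('/\s*+(.*\S)/', $input, $matches);
    return $matches[1];
}

$lines = trimmedLines("  foo bar  \n\n   baz\t\n   ");
// $lines is ['foo bar', 'baz']
```

The trailing whitespace of each line is never matched explicitly; it is consumed by the \s*+ at the start of the next match, or left unmatched at the end of the input.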