Split on various delimiters while keeping the delimiters?
I would like to split a text such as
过公元年?因为无论你如何选择。简体字危及了对古代文学的研究输入!
using one of these three (or more) characters as the delimiter: ?!。
I can of course do this with
$lines = preg_split('/[。,!,?]/u',$body);
However, I want the resulting lines to keep their ending delimiter. Also, a sentence might end like this: 啊。。。
or like this: 什么!??!!!!
Comments (3)
Try this:
It splits at a position that's preceded by one of your delimiter characters but not followed by one. It doesn't consume the delimiter, and if there are two or more consecutive delimiters, it only matches after the last one.
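The pattern from this answer did not survive on this page, so the following is only a sketch of what the description suggests: a zero-width split using a lookbehind and a negative lookahead. The function name splitKeepingDelimiters and the exact delimiter set (ASCII ! and ? plus the fullwidth 。!?) are assumptions made for illustration, not the answerer's original code.

```php
<?php
// Hypothetical helper; the delimiter set below is an assumption, extend it as needed.
function splitKeepingDelimiters(string $body): array
{
    // (?<=[...]) : the position must be preceded by a delimiter.
    // (?![...])  : the position must NOT be followed by a delimiter,
    //              so a run such as "!??!!!!" is only split after its last character.
    // Both assertions are zero-width, so no delimiter is consumed.
    $pattern = '/(?<=[。!?!?])(?![。!?!?])/u';

    // PREG_SPLIT_NO_EMPTY drops the empty piece produced when the text ends with a delimiter.
    return preg_split($pattern, $body, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitKeepingDelimiters('啊。。。什么!??!!!!'));
```

With the sample text from the question, this should yield one element per sentence, each still carrying its trailing run of delimiters.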
In this case, you'd like to write the string splitter yourself. And keep continuous delimiters as a whole. (you can set a state variable indicating whether it is in text block or delimiter block).
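A minimal sketch of such a hand-written splitter, assuming the same delimiters as in the question. The function name splitByState and the $inDelimiters flag are hypothetical names chosen for this example; the answer itself gives no code.

```php
<?php
// Hypothetical hand-written splitter; $delimiters defaults to an assumed set.
function splitByState(string $body, array $delimiters = ['。', '!', '?']): array
{
    // Break the UTF-8 string into single characters.
    $chars = preg_split('//u', $body, -1, PREG_SPLIT_NO_EMPTY);

    $lines = [];
    $current = '';
    $inDelimiters = false; // state: true while we are inside a run of delimiters

    foreach ($chars as $ch) {
        $isDelimiter = in_array($ch, $delimiters, true);

        // Leaving a delimiter run means the previous sentence is complete.
        if ($inDelimiters && !$isDelimiter) {
            $lines[] = $current;
            $current = '';
        }

        $current .= $ch;
        $inDelimiters = $isDelimiter;
    }

    // Flush whatever is left, with or without a trailing delimiter.
    if ($current !== '') {
        $lines[] = $current;
    }

    return $lines;
}

print_r(splitByState('啊。。。什么!??!!!!'));
```

The state variable is what keeps consecutive delimiters together: a sentence is only closed when the loop moves from a delimiter back to ordinary text.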
You should use preg_match_all instead of preg_split. See http://www.ideone.com/rN7MB for usage.
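The linked snippet is not reproduced on this page, so the following is only a guess at the general shape of a preg_match_all approach, not the answerer's code. The function name matchSentences, the pattern, and the delimiter set are all assumptions.

```php
<?php
// Hypothetical helper illustrating matching sentences instead of splitting on delimiters.
function matchSentences(string $body): array
{
    // First alternative: optional text followed by one or more delimiters.
    // Second alternative: trailing text that has no closing delimiter.
    $pattern = '/[^。!?]*[。!?]+|[^。!?]+/u';

    preg_match_all($pattern, $body, $matches);

    // $matches[0] holds the full matches in order of appearance.
    return $matches[0];
}

print_r(matchSentences('过公元年?因为无论你如何选择。简体字危及了对古代文学的研究输入!'));
```

Matching rather than splitting sidesteps the problem of consumed delimiters, since each match already includes the punctuation that ends it.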