Regular expression for parsing italic text?
Suppose I have the following text:
__This_is__ a __test__
I am using two underscores to denote italics, so I expect This_is and test to be italicized. The logic dictates that any text between two consecutive double underscores should be italicized, including any number of single underscores it may contain. So far I've got:
__([^_]+)__
What is the equivalent of "not two consecutive underscores" in group 1? Thanks.
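To make the problem concrete, here is a minimal Python sketch of what the attempted pattern does on the sample text. The choice of Python and the names text, attempt, and captured are assumptions for illustration; the question does not name a language.

```python
import re

text = "__This_is__ a __test__"

# The pattern from the question: group 1 may not contain any underscore.
attempt = re.compile(r"__([^_]+)__")

# findall returns the contents of group 1 for every match.
captured = attempt.findall(text)
print(captured)  # [' a ']
```

Because [^_]+ cannot cross the single underscore in This_is, the first span never matches. The regex instead pairs the closing __ of the first span with the opening __ of the second and captures the text in between, which is not what is wanted.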
Comments (2)
One option is to match two underscores:
__
Then use a negative lookahead to assert that there are no two underscores directly ahead of the current position:
(?!__)
If that assertion holds, match any character:
(?!__).
and repeat that one or more times:
((?!__).)+
and finally match another two underscores:
__((?!__).)+__
which is the final solution.
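The steps above can be assembled as follows. This is a minimal Python sketch, not the original demo: it assumes the pattern has the form described in the steps (two underscores, a repeated "lookahead, then any character" unit, two underscores), and the names ITALIC, text, and matches are illustrative. The repeated unit is written as a non-capturing group so that re.findall returns whole matches.

```python
import re

# Built piece by piece, mirroring the steps described above.
ITALIC = re.compile(r"""
    __              # opening double underscore
    (?:             # non-capturing group, repeated one or more times:
        (?!__)      #   assert the next two characters are not two underscores
        .           #   then consume any single character
    )+
    __              # closing double underscore
""", re.VERBOSE)

text = "__This_is__ a __test__"

# With no capturing groups, findall returns the full text of each match.
matches = ITALIC.findall(text)
print(matches)  # ['__This_is__', '__test__']
```

The single underscore inside This_is is consumed like any other character, because the lookahead only rejects a position where two underscores follow.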
A little demo of this pattern can be seen on Ideone.
EDIT
Note that I used a non-capturing group in my demo. Otherwise the output would have looked different: the last character matched by
((?!__).)
would have been captured in group 1. For more about groups, see: http://www.regular-expressions.info/brackets.html
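The difference between the two kinds of group can be sketched like this. Python is an assumption here, as are the names last_chars and inner; the third pattern, which wraps the whole repetition in one capturing group, is an illustrative variant for getting the italic text into group 1 and is not taken from the answer itself.

```python
import re

text = "__This_is__ a __test__"

# Capturing group around the repeated unit: group 1 is overwritten on every
# repetition, so only the last character of each span survives.
last_chars = re.findall(r"__((?!__).)+__", text)
print(last_chars)  # ['s', 't']

# Non-capturing inner group wrapped in a single capturing group:
# group 1 now holds the whole text between the double underscores.
inner = re.findall(r"__((?:(?!__).)+)__", text)
print(inner)  # ['This_is', 'test']
```

If the goal is to replace each span with italic markup, the second form is the one whose group 1 can be referenced in the replacement.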
http://ideone.com/uHJCC