Regex: matching a string from one word to another word
I want to extract a string from a piece of text. This string must start and end with a certain string.
Example:
Word 1 = "Hello"
Word 2 = "World"
Text:
Hello, this is a sentence.
The whole World can read this.
What World?
The piece of text I want to extract is:
Hello, this is a sentence.
The whole World
What kind of regular expression should I use to extract the string?
Note: the string 'World' occurs twice.
Thanks
Comments (2)
Here the "." also matches newlines. Note the word boundaries \b: you don't want to match anything that is not exactly Hello or World, such as those words appearing as part of other words.
Note the s modifier, which instructs the dot to match newline characters too.
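The pattern this answer refers to is not preserved above, so the following is only a minimal sketch of the idea in Python, assuming the standard `re` module. The helper name `extract_between` and the sample `TEXT` are illustrative, and `re.DOTALL` plays the role of the s modifier.

```python
import re

# Sample text from the question.
TEXT = (
    "Hello, this is a sentence.\n"
    "The whole World can read this.\n"
    "What World?"
)


def extract_between(text, start="Hello", end="World"):
    """Return the span from the first whole-word `start` to the next whole-word `end`.

    Hypothetical helper for illustration; returns None when nothing matches.
    """
    pattern = re.compile(
        # \b keeps "Hello" and "World" from matching inside longer words.
        r"\b" + re.escape(start) + r"\b.*?\b" + re.escape(end) + r"\b",
        re.DOTALL,  # lets "." match newline characters too
    )
    match = pattern.search(text)
    return match.group(0) if match else None


print(extract_between(TEXT))
```

With the question's text this stops at the first World, so the second occurrence is left out of the result.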
The simplest option is to use a lazy quantifier (*?). That would match from the first Hello to the first World (remember the /s flag, for dot-all). This can be a problem if you don't want the captured text to contain another Hello. A sneakier option is then to forbid Hello inside the matched span.
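The patterns from this answer are not preserved above, so the comparison below is one possible reading, sketched in Python with the standard `re` module. The sample string and the variable names `greedy`, `lazy`, and `tempered` are assumptions for illustration; the negative lookahead `(?!Hello)` is one common way to keep a second Hello out of the match, not necessarily the one the answer used.

```python
import re

# Illustrative input with two "Hello" and two "World" occurrences.
TEXT = "Hello one Hello two World three World"

# Greedy: runs from the first Hello to the LAST World.
greedy = re.search(r"Hello.*World", TEXT, re.DOTALL).group(0)

# Lazy: runs from the first Hello to the FIRST World,
# but may still contain another Hello in between.
lazy = re.search(r"Hello.*?World", TEXT, re.DOTALL).group(0)

# Lookahead variant: each consumed character must not start another "Hello",
# so the match begins at the last Hello before the first World.
tempered = re.search(r"Hello(?:(?!Hello).)*?World", TEXT, re.DOTALL).group(0)

print(greedy)
print(lazy)
print(tempered)
```

The lookahead variant is slower than the plain lazy form because it checks every position, so it is only worth using when a nested Hello is a real possibility.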