
Regex to match all characters between two strings

Posted on 2024-11-09 13:18:29

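Example: This is just\na simple sentence.

I want to match every character between This is and sentence. Line breaks should be ignored. I can't figure out the correct syntax.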


Comments (18)

怪我鬧 2024-11-16 13:18:29

For example

(?<=This is)(.*)(?=sentence)

I used a lookbehind (?<=) and a lookahead (?=) so that "This is" and "sentence" are not included in the match, but this depends on your use case; you can also simply write This is(.*)sentence.

The important thing here is to activate the "dotall" mode of your regex engine, so that . matches newlines. How you do this depends on your regex engine.

The next question is whether you use .* or .*?. The first is greedy and will match up to the last "sentence" in your string; the second is lazy and will match up to the next "sentence" in your string.
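To make the greedy/lazy difference concrete, here is a minimal Python sketch. The sample text with two sentences is an assumption chosen for illustration; it is not from the original question.

```python
import re

# Sample text (an assumption for illustration) containing "sentence" twice.
text = "This is my first sentence. This is my second sentence."

# Greedy: .* runs on to the LAST "sentence" in the string.
greedy = re.search(r"This is(.*)sentence", text, re.DOTALL).group(1)

# Lazy: .*? stops at the NEXT "sentence" after the start delimiter.
lazy = re.search(r"This is(.*?)sentence", text, re.DOTALL).group(1)

print(repr(greedy))  # ' my first sentence. This is my second '
print(repr(lazy))    # ' my first '
```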

Update

This is(?s)(.*)sentence

Here (?s) turns on the dotall modifier, making . match newline characters.
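How dot-all is enabled varies by engine. As one example, the sketch below shows two equivalent ways in Python's re module. Note that recent Python versions only accept a global inline flag such as (?s) at the very start of the pattern, so the mid-pattern placement shown above would need to be moved to the front there.

```python
import re

text = "This is just\na simple sentence"

# Without dot-all, . stops at the newline, so there is no match at all.
no_dotall = re.search(r"This is(.*)sentence", text)

# Dot-all enabled through the flags argument.
with_flag = re.search(r"This is(.*)sentence", text, re.DOTALL)

# Dot-all enabled through the inline modifier, placed at the start of the pattern.
with_inline = re.search(r"(?s)This is(.*)sentence", text)

print(no_dotall)                    # None
print(repr(with_flag.group(1)))     # ' just\na simple '
print(repr(with_inline.group(1)))   # ' just\na simple '
```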

Update 2:

(?<=is \()(.*?)(?=\s*\))

matches the text inside the parentheses in your example "This is (a simple) sentence". See it on Regexr.


嗳卜坏 2024-11-16 13:18:29

Lazy Quantifier Needed

Resurrecting this question because the regex in the accepted answer doesn't seem quite correct to me. Why? Because

(?<=This is)(.*)(?=sentence)

will match " my first sentence. This is my second " in "This is my first sentence. This is my second sentence."

You need a lazy quantifier between the two lookarounds. Adding a ? makes the star lazy.

This matches what you want:

(?<=This is).*?(?=sentence)

See demo. I removed the capture group, which was not needed.

DOTALL Mode to Match Across Line Breaks

Note that in the demo the "dot matches line breaks" mode (a.k.a. dot-all) is set (see how to turn on DOTALL in various languages). In many regex flavors, you can set it with the inline modifier (?s), turning the expression into:

(?s)(?<=This is).*?(?=sentence)
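As a rough check of this pattern, the following Python sketch runs it over a two-sentence text. The sample text is an assumption for illustration. Because the pattern has no capture group, findall returns the whole match for each occurrence.

```python
import re

# Sample text (an assumption): two occurrences, the second spanning a line break.
text = "This is my first sentence.\nThis is my\nsecond sentence."

# Lazy quantifier between the two lookarounds, with dot-all switched on inline.
pattern = re.compile(r"(?s)(?<=This is).*?(?=sentence)")

matches = pattern.findall(text)
print(matches)  # [' my first ', ' my\nsecond ']
```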


爱格式化 2024-11-16 13:18:29

Try This is[\s\S]*?sentence; it works in JavaScript.
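The [\s\S] class matches any character, newlines included, so no dot-all flag is needed. The answer above is about JavaScript; the sketch below only illustrates that the same idea carries over to Python's re module, with a capture group added (an addition for illustration) to pull out the text in between.

```python
import re

text = "This is just\na simple sentence"

# [\s\S] = "whitespace or non-whitespace", i.e. any character at all.
m = re.search(r"This is([\s\S]*?)sentence", text)
between = m.group(1)

print(repr(between))  # ' just\na simple '
```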


格子衫的從容 2024-11-16 13:18:29

This:

This is (.*?) sentence

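works in JavaScript, provided the text in between does not contain a line break (without the s flag, . does not match newlines).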


爱你是孤单的心事 2024-11-16 13:18:29

Use this: (?<=beginningstringname)(.*\n?)(?=endstringname)


独自唱情﹋歌 2024-11-16 13:18:29

This worked for me (I'm using VS Code):

For:
This is just\na simple sentence

Use:
This.+sentence


百变从容 2024-11-16 13:18:29

You can simply use this: \This is .*? \sentence


野鹿林 2024-11-16 13:18:29

In case of JavaScript you can use [^] to match any character including newlines.

Using the /s flag with a dot . to match any character also works, but it applies to the whole pattern, and JavaScript does not support inline modifiers to turn the flag on or off.

To match as few characters as possible, you can make the quantifier non-greedy by appending a question mark, and use a capture group to extract the part in between.

This is([^]*?)sentence

See a regex101 demo.

As a side note, to not match partial words you can use word boundaries like \bThis and sentence\b

const s = "This is just\na simple sentence";
const regex = /This is([^]*?)sentence/;
const m = s.match(regex);

if (m) {
  console.log(m[1]);
}

The lookaround variant in JavaScript is (?<=This is)[^]*?(?=sentence); check "Lookbehind in JS regular expressions" for engine support.

Also see Important Notes About Lookbehind.

const s = "This is just\na simple sentence";
const regex = /(?<=This is)[^]*?(?=sentence)/;
const m = s.match(regex);

if (m) {
  console.log(m[0]);
}
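The side note about word boundaries can be illustrated with a short sketch. It is written in Python rather than JavaScript, and the sample text is an assumption picked so that "sentence" also appears inside the longer word "sentences".

```python
import re

# Sample text (an assumption): "sentences" contains "sentence" as a partial word.
text = "This is a sentences test. This is one sentence."

# Without boundaries the match ends inside "sentences".
loose = re.search(r"This is(.*?)sentence", text).group(1)

# With \b on both sides, only the standalone word "sentence" ends the match.
strict = re.search(r"\bThis is(.*?)\bsentence\b", text).group(1)

print(repr(loose))   # ' a '
print(repr(strict))  # ' a sentences test. This is one '
```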


心如狂蝶 2024-11-16 13:18:29

RegEx to match everything between two strings, using the Java approach.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> results = new ArrayList<>(); // For storing results
String example = "Code will save the world";

Let's use Pattern and Matcher objects with the RegEx (.*?).

Pattern p = Pattern.compile("Code (.*?) world");   // lazy capture between the two delimiters
Matcher m = p.matcher(example);

Since a Matcher may find more than one match, we need to loop over the results and store each one.

while (m.find()) {             // Loop through all matches
    results.add(m.group(1));   // Group 1 is the text between the delimiters
}

For this example, results will contain only "will save the", but in a larger text it will probably find more matches.


话少情深 2024-11-16 13:18:29

In case anyone is looking for an example of this in a Jenkins context: this parses build.log and, if it finds a match, fails the build with the matched text as the error message.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

node{    
    stage("parse"){
        def file = readFile 'build.log'

        def regex = ~"(?s)(firstStringToUse(.*)secondStringToUse)"
        Matcher match = regex.matcher(file)
        if (match.find()) {
            def capturedText = match.group(1)
            error(capturedText)
        }
    }
}


尾戒 2024-11-16 13:18:29

There is a way to deal with repeated instances of this split in a block of text. For instance: "This is just\na simple sentence. Here is some additional stuff. This is just\na simple sentence. And here is some more stuff. This is just\na simple sentence." To match each instance instead of the entire string, use the code below:

import re

data = "This is just\na simple sentence. Here is some additional stuff. This is just\na simple sentence. And here is some more stuff. This is just\na simple sentence."

# re.DOTALL lets . match newlines; the lazy .*? stops at the nearest " sentence"
pattern = re.compile(r'This is .*? sentence', re.DOTALL)

for match_instance in pattern.finditer(data):
    do_something(match_instance.group())  # do_something is a placeholder for your own handler


超可爱的懒熊 2024-11-16 13:18:29

Here is how I did it. This was easier for me than trying to figure out the exact regex needed.

int indexPictureData = result.IndexOf("-PictureData:");
int indexIdentity = result.IndexOf("-Identity:");
string returnValue = result.Remove(indexPictureData + 13);
returnValue = returnValue + " [bytecoderemoved] " + result.Remove(0, indexIdentity);
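The same index-based idea, without any regex, can be sketched in Python. The helper name between is hypothetical, and unlike the snippet above, which replaces the middle part with a placeholder, this sketch extracts it.

```python
def between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    i = text.find(start)
    if i == -1:
        return None          # start delimiter not present
    i += len(start)          # move past the start delimiter
    j = text.find(end, i)    # look for the end delimiter after it
    if j == -1:
        return None          # end delimiter not present
    return text[i:j]

result = between("This is just\na simple sentence", "This is", "sentence")
print(repr(result))  # ' just\na simple '
```

Plain string searching treats newlines like any other character, so no dot-all setting is involved.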


一个人的旅程 2024-11-16 13:18:29

I landed here while searching for a regex to convert the Python 2 print statement, print "string", in old scripts to the Python 3 form, print("string"). It works well; for other conversions, use 2to3.py. Here is my solution for others:

Try it out on Regexr.com (for some reason it doesn't work in Notepad++):

find:     (?<=print)( ')(.*)(')
replace: ('$2')

for variables:

(?<=print)( )(.*)(\n)
('$2')\n

for label and variable:

(?<=print)( ')(.*)(',)(.*)(\n)
('$2',$4)\n

Related question: How to replace all print "string" in Python2 with print("string") for Python3?
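A scripted version of this find-and-replace might look like the Python sketch below. It uses a single, simpler pattern that is an assumption of this sketch, not the patterns listed above: everything after "print " on a line is wrapped in parentheses. It does not handle trailing commas, redirection with >>, or lines that are already converted.

```python
import re

# Sample Python 2 source (an assumption for illustration).
old_source = "print 'hello'\nprint 'a label', value\n"

# (?m) makes ^ and $ work per line; group 1 keeps the indentation,
# group 2 is everything after "print " up to the end of the line.
new_source = re.sub(r"(?m)^([ \t]*)print (.+)$", r"\1print(\2)", old_source)

print(new_source)
```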


默嘫て 2024-11-16 13:18:29

For Python:

import re

def match_between_strings(text, start_str, end_str):
    pattern = re.escape(start_str) + r'(.*?)' + re.escape(end_str)
    matches = re.findall(pattern, text, re.DOTALL)
    return matches

Example usage:

start_str = "This"
end_str = "sentence"
text = "This is just\na simple sentence"

result = match_between_strings(text, start_str, end_str)

Result

[' is just\na simple ']
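One reason to keep re.escape in this helper is that the delimiters may contain regex metacharacters. The sketch below repeats the function so that it runs on its own and calls it with delimiters containing parentheses; the input string is an assumption for illustration.

```python
import re

def match_between_strings(text, start_str, end_str):
    # re.escape makes the delimiters match literally, even when they contain
    # characters such as ( or ) that are special in a regex.
    pattern = re.escape(start_str) + r"(.*?)" + re.escape(end_str)
    return re.findall(pattern, text, re.DOTALL)

found = match_between_strings("This is (a simple)\nsentence", "is (", ")")
print(found)  # ['a simple']
```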


有深☉意 2024-11-16 13:18:29

Rather than extracting the bits you want, you could replace the bits you don't want with empty strings.

In Ruby,

"This is just\na simple sentence".gsub(/^This is|sentence\z/, '')
  #=> " just\na simple "
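A rough Python counterpart of this "delete the delimiters" approach is sketched below. It assumes, as in the Ruby example, that the string starts with "This is" and ends with "sentence"; \A and \Z anchor to the start and end of the whole string in Python's re module.

```python
import re

text = "This is just\na simple sentence"

# Remove the leading "This is" and the trailing "sentence"; what remains
# is the text in between, newline included.
stripped = re.sub(r"\AThis is|sentence\Z", "", text)

print(repr(stripped))  # ' just\na simple '
```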


煮酒 2024-11-16 13:18:29

For a quick search in Vim, you can use the following
at the Vim command prompt: /This is.*\_.*sentence


此刻的回忆 2024-11-16 13:18:29

I had this string:

      headers:
        Date:
          schema:
            type: string
            example: Tue, 23 Aug 2022 11:36:23 GMT
        Content-Type:
          schema:
            type: string
            example: application/json; charset=utf-8
        Transfer-Encoding:
          schema:
            type: string
            example: chunked
        Connection:
          schema:
            type: string
            example: keep-alive
        Content-Encoding:
          schema:
            type: string
            example: gzip
        Vary:
          schema:
            type: string
            example: Accept-Encoding
        Server:
          schema:
            type: number
            example: Microsoft-IIS/10.0
        X-Powered-By:
          schema:
            type: string
            example: ASP.NET
        Access-Control-Allow-Origin:
          schema:
            type: string
            example: '*'
        Access-Control-Allow-Credentials:
          schema:
            type: boolean
            example: 'true'
        Access-Control-Allow-Headers:
          schema:
            type: string
            example: '*'
        Access-Control-Max-Age:
          schema:
            type: string
            example: '-1'
        Access-Control-Allow-Methods:
          schema:
            type: string
            example: GET, PUT, POST, DELETE
        X-Content-Type-Options:
          schema:
            type: string
            example: nosniff
        X-XSS-Protection:
          schema:
            type: string
            example: 1; mode=block
      content:
        application/json:

and I wanted to remove everything from headers: up to content, so I wrote this regex: (headers:)[^]*?(content)

It worked as expected, finding every occurrence of that expression.
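The [^] class is a JavaScript-style construct, and Python's re module does not accept it. A Python sketch of the same removal is shown below, using a shortened sample of the YAML (an assumption for brevity). It also uses a lookahead so that content: itself is kept, which is a choice of this sketch rather than part of the regex above.

```python
import re

# Shortened sample of the spec (an assumption for brevity).
spec = (
    "      headers:\n"
    "        Date:\n"
    "          schema:\n"
    "            type: string\n"
    "      content:\n"
    "        application/json:\n"
)

# Lazily remove from "headers:" up to, but not including, the next "content:".
cleaned = re.sub(r"headers:.*?(?=content:)", "", spec, flags=re.DOTALL)

print(cleaned)
```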


南风起 2024-11-16 13:18:29

Sublime Text 3x

In Sublime Text, you simply write the two words you are interested in, for example, in your case,

"This is" and "sentence"

and then write .* in between, i.e.

This is .* sentence

and this should do you well.
