How to extract a substring using regex
I have a string that contains two single quotes (the ' character). The data I want is between the single quotes.
How can I write a regex to extract "the data i want" from the following text?
mydata = "some string with 'the data i want' inside";
Assuming you want the part between single quotes, use a regular expression that captures the quoted text, together with a Matcher.
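The example that accompanied this answer was not preserved here. A minimal sketch of the approach it describes might look like the following; the pattern '(.*?)' and the names QuotedExtractor and firstQuoted are assumptions made for illustration, not the answer's original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedExtractor {
    // Non-greedy capture of whatever sits between the first pair of single quotes.
    private static final Pattern QUOTED = Pattern.compile("'(.*?)'");

    // Returns the text between the first pair of single quotes, or null if none is found.
    static String firstQuoted(String input) {
        Matcher matcher = QUOTED.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String mydata = "some string with 'the data i want' inside";
        System.out.println(firstQuoted(mydata)); // prints: the data i want
    }
}
```

Returning null for "no match" is a design choice of this sketch; an Optional<String> would work just as well.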
You don't need regex for this.
Add Apache Commons Lang to your project (http://commons.apache.org/proper/commons-lang/) and use its string utilities instead.
There's a simple one-liner for this.
By making the matching group optional, it also handles the case where no quotes are found, returning a blank string.
See live demo.
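The one-liner itself did not survive in this copy. One possible form, assuming single-line input, is sketched below; the exact pattern and the names QuotedOneLiner and extract are assumptions, chosen to match the description of an optional matching group.

```java
class QuotedOneLiner {
    // The optional non-capturing group (?:'(.*?)')? lets the pattern match the
    // whole input even when it contains no quotes. In that case group 1 is
    // unset and "$1" contributes nothing, so the result is an empty string.
    // Assumes single-line input, because '.' does not match line terminators.
    static String extract(String input) {
        return input.replaceAll("[^']*(?:'(.*?)')?.*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(extract("some string with 'the data i want' inside")); // prints: the data i want
        System.out.println(extract("no quotes here").isEmpty());                  // prints: true
    }
}
```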
Since Java 9
As of this version, you can use the new no-argument method Matcher::results, which conveniently returns a Stream<MatchResult>, where MatchResult represents the result of a match operation and lets you read matched groups and more (this interface has existed since Java 1.5).
The biggest advantage over the procedural if (matcher.find()) and while (matcher.find()) checks and processing is ease of use when one or more results are available.
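As a rough illustration of the stream-based approach, the sketch below collects every quoted substring with Matcher::results. It requires Java 9 or later; the pattern, the sample input, and the names QuotedStream and allQuoted are assumptions for illustration only.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class QuotedStream {
    private static final Pattern QUOTED = Pattern.compile("'(.*?)'");

    // Collects the first capture group of every match, in order of appearance.
    static List<String> allQuoted(String input) {
        return QUOTED.matcher(input)
                .results()                  // Stream<MatchResult>, available since Java 9
                .map(mr -> mr.group(1))     // text between each pair of quotes
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "some string with 'the data i want' inside and 'more data'";
        System.out.println(allQuoted(text)); // prints: [the data i want, more data]
    }
}
```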
Because you also tagged Scala: a solution without regex can easily deal with multiple quoted strings.
As in JavaScript, the actual regexp is:
/'([^']+)'/
If you use the non-greedy modifier (as per another post), it is cleaner.
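To compare the two styles of pattern, here is a small sketch in Java (used for consistency with the other answers rather than JavaScript). The negated-class pattern comes from the answer; the non-greedy form '(.*?)' is an assumption about what "non greedy modifier" refers to, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotePatterns {
    // Negated character class: one or more characters that are not a quote.
    static final Pattern NEGATED_CLASS = Pattern.compile("'([^']+)'");
    // Non-greedy wildcard: as few characters as possible, possibly none.
    static final Pattern NON_GREEDY = Pattern.compile("'(.*?)'");

    // Returns group 1 of the first match, or null if the pattern does not match.
    static String first(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String mydata = "some string with 'the data i want' inside";
        System.out.println(first(NEGATED_CLASS, mydata)); // prints: the data i want
        System.out.println(first(NON_GREEDY, mydata));    // prints: the data i want
    }
}
```

The two differ on edge cases: for an input with empty quotes such as "a '' b", the negated-class pattern finds no match because it needs at least one character, while the non-greedy pattern matches and captures an empty string.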
String dataIWant = mydata.split("'")[1];
See Live Demo
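One caveat with the split approach is that indexing [1] throws ArrayIndexOutOfBoundsException when the input contains no quote at all. A guarded variant could look like this sketch; the names QuotedSplit and secondPiece are hypothetical.

```java
class QuotedSplit {
    // Splits on single quotes and returns the piece after the first quote,
    // or null when the input yields fewer than two pieces.
    static String secondPiece(String input) {
        String[] parts = input.split("'");
        return parts.length > 1 ? parts[1] : null;
    }

    public static void main(String[] args) {
        System.out.println(secondPiece("some string with 'the data i want' inside")); // prints: the data i want
        System.out.println(secondPiece("no quotes here"));                            // prints: null
    }
}
```

Note that this does not check for a closing quote: an input such as "it's" still returns "s".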
Apache Commons Lang provides a host of helper utilities for the java.lang API, most notably String manipulation methods.
In your case, the start and end substrings are the same, so a single call with that delimiter is enough.
If the start and end substrings differ, use the overloaded method that takes both.
If you want all instances of the matching substrings, use the variant that returns every match, which also works for the example in question.
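The method calls from this answer were not preserved. Commons Lang's StringUtils does provide substringBetween and substringsBetween, which are presumably what was meant. To stay runnable without the dependency, the sketch below uses plain-JDK stand-ins with the hypothetical names between and allBetween; it illustrates the idea and is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

class BetweenSketch {
    // First substring enclosed by open and close, or null if either delimiter is missing.
    static String between(String input, String open, String close) {
        int start = input.indexOf(open);
        if (start < 0) return null;
        start += open.length();
        int end = input.indexOf(close, start);
        return end < 0 ? null : input.substring(start, end);
    }

    // Every substring enclosed by open and close, scanning left to right.
    static List<String> allBetween(String input, String open, String close) {
        if (open.isEmpty() || close.isEmpty()) {
            throw new IllegalArgumentException("delimiters must be non-empty");
        }
        List<String> found = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = input.indexOf(open, from);
            if (start < 0) break;
            start += open.length();
            int end = input.indexOf(close, start);
            if (end < 0) break;
            found.add(input.substring(start, end));
            from = end + close.length(); // continue after the closing delimiter
        }
        return found;
    }

    public static void main(String[] args) {
        String mydata = "some string with 'the data i want' inside";
        System.out.println(between(mydata, "'", "'"));          // prints: the data i want
        System.out.println(allBetween("a 'x' b 'y' c", "'", "'")); // prints: [x, y]
    }
}
```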
In Scala,
Add the Apache Commons dependency to your pom.xml; the code below then works.
You can use a while loop to store all matching substrings in an array. If you use
if (matcher.find())
{
    System.out.println(matcher.group(1));
}
you will get only one matching substring, so use the while loop to get all of them.
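The loop described here might be sketched as follows. The pattern '(.*?)' and the names AllQuoted and allQuoted are assumptions, since the answer's own loop was not preserved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllQuoted {
    // Calls find() repeatedly; each successful call advances to the next match.
    static String[] allQuoted(String input) {
        Matcher matcher = Pattern.compile("'(.*?)'").matcher(input);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group(1));
        }
        return matches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String text = "some string with 'the data i want' inside and 'more data'";
        System.out.println(Arrays.toString(allQuoted(text))); // prints: [the data i want, more data]
    }
}
```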
Somehow group(1) didn't work for me. I used group(0) to find the URL version.
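The commenter's pattern is not shown, so the cause is unknown. One possible explanation, offered as an assumption, is that the pattern had no capturing group: group(0) always returns the whole match, while group(1) throws IndexOutOfBoundsException if the pattern defines no group 1. The sketch below uses the hypothetical names GroupDemo and group to show the difference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns the requested group of the first match, or null if nothing matches.
    static String group(String regex, String input, int index) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(index) : null;
    }

    public static void main(String[] args) {
        String mydata = "some string with 'the data i want' inside";
        System.out.println(group("'(.*?)'", mydata, 0)); // prints: 'the data i want'
        System.out.println(group("'(.*?)'", mydata, 1)); // prints: the data i want
        // With no parentheses in the pattern there is no group 1:
        // group("'.*?'", mydata, 1) would throw IndexOutOfBoundsException.
    }
}
```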