Extracting text with Java regex, with group-reference support
I am having some trouble writing a method in Java. It extracts text that matches a pattern and returns ALL the extractions. It works just like java.util.regex.Matcher's find()/matches() followed by group():
Matcher matcher = pattern.matcher(fileContent);
StringBuilder sb = new StringBuilder();
// find() advances to the next match; matches() would test the whole input instead
while (matcher.find()) {
    sb.append(matcher.group()).append("\n");
}
return sb.toString();
However, I would like the extractions to be formatted with support for group references (dollar sign, $) and literal-character escaping (backslash, \), just like the replacement string in Matcher.replaceAll(replacement) (Doc). For example:
fileContent = """
aaabbcac aabb
bcbcbbccc babba
""";
pattern = Pattern.compile("bb.*(.)(abb)");
extractionFormatter = "$1: $0, \\$$2";
The expected output would be:
a: bbcac aabb, $abb
b: bbccc babb, $abb
I hope it is clear what I am trying to do. Do you know of any existing library or method that can achieve this without my having to reinvent the wheel?
Comments (2)
You can use the results method of the Matcher class, which returns a stream of MatchResults, to first get all matches. Get each match as a string using MatchResult.group, then replace it using String.replaceAll with the pattern as the regex and your extractionFormatter as the replacement, and finally join all the results with newlines.
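The steps in this answer could be sketched roughly as follows. This is not the answerer's original code, only a minimal illustration: the class name ResultsStreamExtractor and the helper extract are hypothetical, and Matcher.results() requires Java 9 or later. The sketch calls pattern.matcher(match).replaceAll(...) rather than String.replaceAll so that the already compiled pattern (including any flags) is reused.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ResultsStreamExtractor {

    // Hypothetical helper: formats every match of `pattern` in `input` with a
    // replaceAll-style template and joins the formatted matches with newlines.
    static String extract(Pattern pattern, String input, String formatter) {
        return pattern.matcher(input)
                .results()                   // Stream<MatchResult>, Java 9+
                .map(MatchResult::group)     // text of each whole match
                // Re-apply the same pattern to the matched text so that $n
                // references and backslash escapes in the formatter are expanded.
                .map(match -> pattern.matcher(match).replaceAll(formatter))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String fileContent = "aaabbcac aabb\nbcbcbbccc babba\n";
        Pattern pattern = Pattern.compile("bb.*(.)(abb)");
        // In Java source, "\\$$2" is the string \$$2: a literal $ followed by group 2.
        System.out.println(extract(pattern, fileContent, "$1: $0, \\$$2"));
    }
}
```

One caveat of this approach: each match is matched a second time in isolation, so a pattern that relies on surrounding context (anchors, lookarounds) may behave differently on the extracted substring.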
You can use String.replaceAll instead.
Note that to get the desired output with capture groups, the pattern also has to match the parts of the string that should not appear in the result, so that the replacement removes them.
A pattern built that way gives the desired output. See a Java demo.
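As an illustration of the idea, here is one possible pattern. It is an assumption for this example, not necessarily the pattern from the original answer: the asker's regex is wrapped in an extra group, and a lazy prefix and a greedy suffix consume the rest of each line. Because of the extra outer group, the original groups 0, 1 and 2 become groups 1, 2 and 3 in the replacement. The class name ReplaceAllExtractor is hypothetical.

```java
class ReplaceAllExtractor {

    static String extract(String input) {
        // (?m) lets ^ and $ match at line boundaries. The lazy ".*?" prefix and
        // the trailing ".*" match the text around the wanted part of each line,
        // so replaceAll removes it along with inserting the formatted result.
        String regex = "(?m)^.*?(bb.*(.)(abb)).*$";
        // \$ is a literal dollar sign; $1..$3 refer to the shifted groups.
        String replacement = "$2: $1, \\$$3";
        return input.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        String fileContent = "aaabbcac aabb\nbcbcbbccc babba\n";
        System.out.print(extract(fileContent));
    }
}
```

This sketch handles at most one match per line, and lines that contain no match are left in the output unchanged, so it fits the sample input but is not a general replacement for a find() loop.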
Alternatively, use a StringBuilder, a Matcher, and a while loop. See a Java demo.
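A loop-based variant might look like the sketch below. It is not the code from the original answer. It assumes Java 9 or later for Matcher.appendReplacement(StringBuilder, String), which expands $n references and backslash escapes exactly as replaceAll does, and it then deletes the unmatched text that appendReplacement copies in front of each replacement. The class name LoopExtractor and the helper extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LoopExtractor {

    // Hypothetical helper: returns one formatted line per match.
    static String extract(Pattern pattern, String input, String formatter) {
        Matcher matcher = pattern.matcher(input);
        StringBuilder sb = new StringBuilder();
        int lastEnd = 0; // end of the previous match (the matcher's append position)
        while (matcher.find()) {
            int mark = sb.length();
            // Appends the text between the previous match and this one,
            // followed by the formatter with its references expanded.
            matcher.appendReplacement(sb, formatter);
            // Remove the in-between text so that only the formatted match remains.
            sb.delete(mark, mark + (matcher.start() - lastEnd));
            sb.append('\n');
            lastEnd = matcher.end();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String fileContent = "aaabbcac aabb\nbcbcbbccc babba\n";
        Pattern pattern = Pattern.compile("bb.*(.)(abb)");
        System.out.print(extract(pattern, fileContent, "$1: $0, \\$$2"));
    }
}
```

Since the formatter is expanded against the live match rather than against a re-matched substring, this version keeps working when the pattern depends on context outside the match.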