Regular expression does not extract the exact pattern

Posted on 2025-01-07 14:46:51

I am working in Java with a string of more than 100,000 characters.
I have a list of keywords that I search the string for, and if a keyword is present I call a function that does some internal processing.

For example, one of my keywords is "face". I want all matches such as "faces" but not "facebook". Whitespace next to the keyword is fine, so if the string contains a match like " face" or " faces" or "face " or " faces", I can accept that too. However, I cannot accept "duckface" or "duckface ".

I have written the regex

Pattern p = Pattern.compile("\\s+"+keyword+"s\\s+|\\s+");

where keyword is an entry from my list of keywords, but I am not getting the desired results. Can you suggest what the issue might be and how I can fix it?

I would also appreciate a pointer to a really good page on regular expressions for Java.

Thank you, contributors.

Edit

I know it is not working because I used the following code:

Pattern p = Pattern.compile("\\s+"+keyword+"s\\s+|\\s+");
Matcher m = p.matcher(myInputDataSting);
if(m.find())
{
    System.out.println("Its a Match: "+m.group());
}

This returns a blank string...

Comments (2)

翻了热茶 2025-01-14 14:46:51

If keyword is "face", then your current regex is

\s+faces\s+|\s+

which matches either one or more whitespace characters, followed by faces, followed by one or more whitespace characters, or one or more whitespace characters. (The pipe | has very low precedence.)
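The precedence point can be checked with a small experiment. This is only a sketch: the class name AlternationDemo, the helper firstMatch, and the sample sentence are made up for illustration and are not from the question. It shows that the first find() stops at the first run of whitespace, because the right-hand branch \s+ succeeds there on its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationDemo {
    // Returns the first match of the regex in the input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String keyword = "face";
        String input = "I saw two faces today"; // hypothetical sample text

        // The pipe splits the whole pattern into two branches:
        //   \s+faces\s+   OR   \s+
        String original = "\\s+" + keyword + "s\\s+|\\s+";

        // At the space after "I" the left branch fails ("saw" is not "faces"),
        // but the right branch matches that single space.
        System.out.println("[" + firstMatch(original, input) + "]"); // prints [ ]
    }
}
```

That single space is the "blank string" reported in the question.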

What you really want is

\bfaces?\b

which matches a word boundary, followed by face, optionally followed by s, followed by a word boundary.

So, you can write:

Pattern p = Pattern.compile("\\b"+keyword+"s?\\b");

(though obviously this will only work for words like face that form their plurals by simply adding s).
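As a rough sketch of how the word-boundary pattern could be used to collect every hit, consider the following. The class KeywordFinder, its methods, and the sample input are hypothetical. Pattern.quote is an addition that the answer does not mention; it is used here on the assumption that some keywords might contain regex metacharacters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeywordFinder {
    // Builds \b<keyword>s?\b. Pattern.quote treats the keyword as literal text.
    static Pattern keywordPattern(String keyword) {
        return Pattern.compile("\\b" + Pattern.quote(keyword) + "s?\\b");
    }

    // Collects every match of the keyword (with optional trailing s) in the input.
    static List<String> findAll(String keyword, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = keywordPattern(keyword).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "face faces facebook duckface"; // hypothetical sample text
        System.out.println(findAll("face", input)); // prints [face, faces]
    }
}
```

Note that \b only behaves as a whole-word test when the keyword itself starts and ends with a word character (letter, digit, or underscore).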

You can find a comprehensive listing of Java's regular-expression support at http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html, but it's not much of a tutorial. For that, I'd recommend just Googling "regular expression tutorial" and finding one that suits you. (It doesn't have to be Java-specific: most of the tutorials you'll find are for regular-expression flavors that are very similar to Java's.)

落日海湾 2025-01-14 14:46:51

You should use

Pattern p = Pattern.compile("\\b"+keyword+"s?\\b");

where keyword is the singular form. \\b (the regex \b, escaped for a Java string) means that the keyword must appear as a complete word in the searched string. s? means that the keyword may be followed by an optional s.

If you are not familiar enough with regular expressions, I recommend reading http://docs.oracle.com/javase/tutorial/essential/regex/index.html, because it has examples and explanations.
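The number of backslashes matters in Java source. In a string literal, "\b" is the backspace character (U+0008), while "\\b" reaches the regex engine as \b, the word boundary. A minimal sketch of the difference follows; the class EscapeDemo, the helper found, and the sample input are made up for illustration.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Reports whether the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "two faces"; // hypothetical sample text

        // Single backslash: the pattern contains literal backspace characters,
        // which the input does not have, so nothing matches.
        System.out.println(found("\bfaces?\b", input));   // prints false

        // Double backslash: the regex engine sees \b, a word boundary.
        System.out.println(found("\\bfaces?\\b", input)); // prints true
    }
}
```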
