C# - Regex to match whole words

Posted on 2024-11-01 11:13:37

I need to match all whole words that contain a given string.

string s = "ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";

Regex r = new Regex("(?<TM>[!\..]*TEST.*)", ...);
MatchCollection mc = r.Matches(s);

I need the result to be:

MYTESTING
YOUTESTED
TESTING

But I get:

TESTING
TESTED
.TESTING

How do I achieve this with regular expressions?

Edit: Extended sample string.

Comments (6)

月依秋水 2024-11-08 11:13:37

If you are looking for all words containing 'TEST', you should use

@"(?<TM>\w*TEST\w*)"

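\w matches word characters. For ASCII input it is equivalent to [A-Za-z0-9_]; in .NET it also matches Unicode letters and digits unless RegexOptions.ECMAScript is specified.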
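As a minimal sketch of how this pattern behaves on the sample from the question, the program below collects every match of \w*TEST\w*. The class name WholeWordDemo and the helper WordsContaining are hypothetical names introduced only for illustration, and the search term is passed through Regex.Escape on the assumption that it may contain regex metacharacters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeWordDemo
{
    // Hypothetical helper: returns every run of word characters that contains `needle`.
    public static string[] WordsContaining(string input, string needle)
    {
        // Regex.Escape keeps any metacharacters in the search term literal.
        string pattern = @"\w*" + Regex.Escape(needle) + @"\w*";
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string s = "ABC.MYTESTING\nXYZ.YOUTESTED\nANY.TESTING";
        foreach (string word in WordsContaining(s, "TEST"))
        {
            // Prints MYTESTING, YOUTESTED, TESTING (one per line).
            Console.WriteLine(word);
        }
    }
}
```

The dot is not a word character, so each match stops at it and the ABC, XYZ and ANY prefixes are left out.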

妥活 2024-11-08 11:13:37

Keep it simple: why not just try \w*TEST\w* as the match pattern?

最美的太阳 2024-11-08 11:13:37

I get the results you are expecting with the following:

string s = @"ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";

var m = Regex.Matches(s, @"(\w*TEST\w*)", RegexOptions.IgnoreCase);

你的往事 2024-11-08 11:13:37

Try using \b. It is the regex anchor for a word boundary: a zero-width assertion that matches the position between a word character and a non-word character. To match every whole word you could use:

/\b[a-z]+\b/i

BTW, .NET doesn't need the surrounding /, and the i is just the case-insensitive match flag.

.NET Alternative:

var re = new Regex(@"\b[a-z]+\b", RegexOptions.IgnoreCase);
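To illustrate what the boundary does, here is a small sketch. The class name WordBoundaryDemo, the helper Find, and the second pattern \bTEST\w* are assumptions added for the demonstration, not part of the answer above. The first pattern returns every whole word. The second shows that \b does not match between two letters, so TEST inside MYTESTING is skipped.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: all case-insensitive matches of `pattern` in `input`.
    public static string[] Find(string input, string pattern)
    {
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string s = "ABC.MYTESTING ANY.TESTING";

        // Every whole word: ABC, MYTESTING, ANY, TESTING
        Console.WriteLine(string.Join(", ", Find(s, @"\b[a-z]+\b")));

        // Only words that start with TEST: TESTING
        // (there is no boundary between Y and T in MYTESTING)
        Console.WriteLine(string.Join(", ", Find(s, @"\bTEST\w*")));
    }
}
```

Note that \b[a-z]+\b on its own matches all words, so it still needs the TEST part of the pattern to answer the original question.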

没︽人懂的悲伤 2024-11-08 11:13:37

Using Groups I think you can achieve it.

string s = @"ABC.TESTING
XYZ.TESTED";
Regex r = new Regex(@"(?<TM>[!\..]*(?<test>TEST.*))", RegexOptions.Multiline);
var mc = r.Matches(s);
foreach (Match match in mc)
{
    Console.WriteLine(match.Groups["test"]);
}

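This prints TESTING and TESTED for this sample. Note that the test group starts at TEST, so any prefix before it (as in MYTESTING) is not captured.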

BTW, your regular expression pattern should be a verbatim string (@"...").
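A short sketch of why the verbatim form matters; the class and constant names are hypothetical. In a regular C# literal every regex backslash has to be doubled, and a sequence such as "\b" silently becomes the backspace character instead of a word-boundary anchor. A sequence like "\." in a regular literal is not a recognized C# escape at all and does not compile.

```csharp
using System;

public static class VerbatimStringDemo
{
    // The same regex pattern written two ways.
    public const string Verbatim = @"\w*TEST\w*";
    public const string Escaped = "\\w*TEST\\w*";

    // In a regular literal, "\b" is the backspace character (U+0008),
    // not the two characters the regex engine needs for a word boundary.
    public const string RegularB = "\b";
    public const string VerbatimB = @"\b";

    public static void Main()
    {
        Console.WriteLine(Verbatim == Escaped);  // True
        Console.WriteLine(RegularB.Length);      // 1
        Console.WriteLine(VerbatimB.Length);     // 2
    }
}
```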

夏雨凉 2024-11-08 11:13:37

Regex r = new Regex(@"(?<TM>[^.]*TEST.*)", RegexOptions.IgnoreCase);

First, as @manojlds said, you should use verbatim strings for regexes whenever possible. Otherwise you'll have to use two backslashes in most of your regex escape sequences, not just one (e.g. [!\\..]*).

Second, if you want to match anything but a dot, that part of the regex should be [^.]*. The metacharacter that negates a character class is ^, not !, and . has no special meaning inside a character class, so it doesn't need to be escaped. As written, [!\..] matches only a literal ! or a literal dot. You should probably use \w* instead, or even [A-Z]*, depending on what exactly you mean by "word":
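The difference between the two character classes can be checked directly. This is a rough sketch; CharClassDemo and FirstMatch are hypothetical names used only here, and the input is a single line taken from the question's sample.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: the first match of `pattern` in `input`, or "" if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // [!\..] accepts only '!' or '.'; [^.] accepts anything except '.'.
        Console.WriteLine(Regex.IsMatch("!", @"\A[!\..]\z"));  // True
        Console.WriteLine(Regex.IsMatch("M", @"\A[!\..]\z"));  // False
        Console.WriteLine(Regex.IsMatch("M", @"\A[^.]\z"));    // True

        // Effect on one line of the sample input.
        Console.WriteLine(FirstMatch("ABC.MYTESTING", @"[!\..]*TEST.*")); // TESTING
        Console.WriteLine(FirstMatch("ABC.MYTESTING", @"[^.]*TEST.*"));   // MYTESTING
    }
}
```

With the original class the letters before TEST can never be consumed, which is why the prefix MY is missing from the output in the question.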

Regex r = new Regex(@"(?<TM>[A-Z]*TEST[A-Z]*)", RegexOptions.IgnoreCase);

That way you don't need to bother with word boundaries, though they don't hurt:

Regex r = new Regex(@"(?<TM>\b[A-Z]*TEST[A-Z]*\b)", RegexOptions.IgnoreCase);

Finally, if you're always taking the whole match anyway, you don't need to use a capturing group:

Regex r = new Regex(@"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase);

The matched text will be available via Match's Value property.
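As a small sketch of that last point (MatchValueDemo and its two helpers are hypothetical names), the named group and the whole match carry the same text when the group wraps the entire pattern:

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchValueDemo
{
    private const string Pattern = @"(?<TM>\b[A-Z]*TEST[A-Z]*\b)";

    // Text of the whole match.
    public static string WholeMatch(string input)
    {
        return Regex.Match(input, Pattern, RegexOptions.IgnoreCase).Value;
    }

    // Text captured by the named group TM.
    public static string NamedGroup(string input)
    {
        return Regex.Match(input, Pattern, RegexOptions.IgnoreCase).Groups["TM"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(WholeMatch("XYZ.YOUTESTED"));  // YOUTESTED
        Console.WriteLine(NamedGroup("XYZ.YOUTESTED"));  // YOUTESTED
    }
}
```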
