Regex to require certain letters and at least one of a particular set of letters

Posted on 2024-11-10 11:26:28

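Can someone help me with a regex to find words that follow this rule?

The word needs to contain the letters J, U and G (just the letters, in any order) and at least one of these letters: G, L, E, R, S.

So I could search a list for jug, juggler, jugglers, juggles, etc.

Thanks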

云雾 2024-11-17 11:26:28

There is also a regex solution. But you should really say which language you are using because, as @Quick Joe Smith wrote, there may be other, better solutions for your task.

^(?=.*J)(?=.*U)(?=.*G)(?=.*[LERS]).*$

See it on Rubular

Those (?=...) groups are positive lookaheads: each one checks that its character occurs somewhere in the string, but consumes nothing. The .* at the end then matches your complete string.

You also need the i modifier to turn on case-insensitive matching.
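As a rough illustration, the lookahead pattern above can be tried in Python, whose `re` module supports lookaheads. This is only a sketch: the helper name `has_required_letters` and the word list are made up for the example, and `re.IGNORECASE` stands in for the i modifier. Note that the pattern as written uses [LERS], so a G on its own does not satisfy the "at least one of" condition.

```python
import re

# Pattern taken from the answer; re.IGNORECASE plays the role of the "i" modifier.
PATTERN = re.compile(r"^(?=.*J)(?=.*U)(?=.*G)(?=.*[LERS]).*$", re.IGNORECASE)


def has_required_letters(word: str) -> bool:
    """True if word contains J, U, G and at least one of L, E, R, S (hypothetical helper)."""
    return PATTERN.match(word) is not None


# Illustrative word list, not from the original question.
words = ["jug", "juggler", "jugglers", "juggles", "jugs", "jungle", "gum"]
matches = [w for w in words if has_required_letters(w)]
print(matches)  # ['juggler', 'jugglers', 'juggles', 'jugs', 'jungle']
```

"jug" is rejected here because it has none of L, E, R, S; adding G to the character class would accept it, since the same G would then satisfy both conditions.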

不寐倦长更 2024-11-17 11:26:28

The first part of your question does not lend itself to regular expressions very well at all. The pattern will end up a convoluted mess, and only get worse as you add more required characters.

The second part, however, is trivial:

m/[glers]/i

So I would suggest implementing the solution in two parts. How you check for the required letters depends on your language:

C# (using Linq)

var chars = "GJU"; // required characters, listed in sorted order to match OrderBy below (needs using System.Linq)
if (inputstring.ToUpper().Intersect(chars).OrderBy(c => c).SequenceEqual(chars)) {
    // do stuff if match.
}

Perl

my @chars = split '', 'GJU'; # Required characters.
my %input = map{($_, 1)} split '', uc $inputstring; # stores unique chars from string.
if (@chars == grep { $input{$_} } @chars) { # True only if every required char is a hash key.
    # Do stuff in here.
}

Python

chars = set('jug')
letters = set(inputstring.lower())  # fold case so 'J' and 'j' both count
if chars <= letters:  # subset test: every required letter is present
    pass  # do something here
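To show the two parts working together, here is a minimal Python sketch that combines the set check with the [glers] test. The names `REQUIRED`, `ONE_OF` and `is_match` are invented for this example, and it assumes the input is a single word.

```python
import re

REQUIRED = set("jug")                            # every one of these letters must appear
ONE_OF = re.compile(r"[glers]", re.IGNORECASE)   # at least one of these must appear


def is_match(word: str) -> bool:
    """Two-part check: required letters via a set, the optional set via a regex."""
    letters = set(word.lower())
    return REQUIRED <= letters and ONE_OF.search(word) is not None


for word in ["jug", "Juggler", "jar", "gust"]:
    print(word, is_match(word))
```

One thing this makes visible: because g is in both sets, any word that passes the first check also passes the second, so "jug" is accepted. If the intent is that a second g (or one of l, e, r, s) is needed, the first part would have to count letters rather than just test membership.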

打小就很酷 2024-11-17 11:26:28

If you're working with one word at a time, try this:

boolean isMatch = s.matches(
  "(?i)^(?:J()|U()|G(?!.*G)()|[GLERS]()|\\w){4,}+$\\1\\2\\3\\4");

If you're searching for matches in a longer string:

Pattern p = Pattern.compile(
    "(?i)\\b(?:J()|U()|G(?!.*G)()|[GLERS]()|\\w){4,}+\\b\\1\\2\\3\\4");
Matcher m = p.matcher(s);
while (m.find()) {
    String foundString = m.group();
}

Each time one of the first four alternatives - J(), U(), G() or [GLERS]() - matches something, the empty group following it "captures" nothing (i.e., an empty string). When the end of the string is reached, each of the backreferences - \1, \2, etc. - tries to match the same thing its corresponding group matched: nothing again.

Obviously, that will always succeed; you can always match nothing. The trick is that the backreference won't even try to match if its corresponding group didn't participate in the match. That is, if there's no j in the target string, the () in the J() alternative never gets involved. When the regex engine processes the \1 backreference later, it immediately reports failure because it knows that group hasn't participated in the match.

In this way, the empty groups act like check boxes, and the backreferences make sure all the boxes have been checked. There's one wrinkle, though. Both the G() and [GLERS]() alternatives can match g; how do you make sure they both participate in the match when you need them to? The first regex I tried,

"(?i)^(?:J()|U()|G()|[GLERS]()|\\w){4,}+$\\1\\2\\3\\4"

...failed to match the word "jugg" because the G() alternative was consuming both g's; [GLERS]() never got a chance to participate. So I added the negative lookahead - (?!.*G) - and now it only matches the last g. If I had three alternatives that could match a g, I would have to add (?!.*G.*G) to the first one and (?!.*G) to the second. But realistically, I probably would have switched to a different approach (probably one not involving regexes) well before I reached that point. ;)
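For readers who want to experiment with the check-box idea outside Java, the following is a Python adaptation of the single-word pattern, not the author's code. It relies on Python's `re` treating a backreference to a group that never participated as a failed match, and it leaves out the possessive `+` after `{4,}` because possessive quantifiers are only available from Python 3.11. The names `CHECKBOX` and `is_match` are made up for the sketch.

```python
import re

# Each empty group () is a "check box"; \1..\4 at the end require all four to be ticked.
# G(?!.*G)() only accepts the last g, leaving earlier g's for the [GLERS]() alternative.
CHECKBOX = re.compile(
    r"^(?:J()|U()|G(?!.*G)()|[GLERS]()|\w){4,}$\1\2\3\4",
    re.IGNORECASE,
)


def is_match(word: str) -> bool:
    """True if all four check boxes were ticked while consuming the word."""
    return CHECKBOX.match(word) is not None


for word in ["jugg", "juggler", "jugs", "jug", "jump"]:
    print(word, is_match(word))
```

With this pattern "jugg", "juggler" and "jugs" should be accepted, while "jug" (only three letters, so the one g cannot tick both boxes) and "jump" (no g at all) should be rejected.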