Regular Expressions: Is there an AND operator?

Posted on 2024-07-11 15:22:28

Obviously, you can use the | (pipe?) to represent OR, but is there a way to represent AND as well?

Specifically, I'd like to match paragraphs of text that contain ALL of a certain set of phrases, but in no particular order.

Comments (15)

感性 2024-07-18 15:22:28

Use a non-consuming regular expression.

The typical (i.e. Perl/Java) notation is:

(?=expr)

This means "match expr but after that continue matching at the original match-point."

You can do as many of these as you want, and this will be an "and." Example:

(?=match this expression)(?=match this too)(?=oh, and this)

You can even add capture groups inside the non-consuming expressions if you need to save some of the data therein.
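A minimal Python sketch of the idea above, assuming Python's `re` module (which uses the same `(?=expr)` notation). The pattern and the helper name `first_word_with_digit` are made up for illustration: two lookaheads are checked at the same position, the first one also captures, and neither consumes any input.

```python
import re

# Two lookaheads evaluated at the same position: both must succeed ("and").
# The first one also captures the word it inspects.
pattern = re.compile(r"(?=(\w+))(?=\w*\d)")

def first_word_with_digit(text):
    """Return the leading word of text if it contains a digit, else None."""
    m = pattern.match(text)
    return m.group(1) if m else None
```

Because lookaheads are zero-width, a successful match here ends at position 0; anything you want consumed has to follow the lookaheads in the pattern.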

缱倦旧时光 2024-07-18 15:22:28

You need to use lookahead as some of the other responders have said, but the lookahead has to account for other characters between its target word and the current match position. For example:

(?=.*word1)(?=.*word2)(?=.*word3)

The .* in the first lookahead lets it match however many characters it needs to before it gets to "word1". Then the match position is reset and the second lookahead seeks out "word2". Reset again, and the final part matches "word3"; since it's the last word you're checking for, it isn't necessary that it be in a lookahead, but it doesn't hurt.

In order to match a whole paragraph, you need to anchor the regex at both ends and add a final .* to consume the remaining characters. Using Perl-style notation, that would be:

/^(?=.*word1)(?=.*word2)(?=.*word3).*$/m

The 'm' modifier is for multiline mode; it lets the ^ and $ match at paragraph boundaries ("line boundaries" in regex-speak). It's essential in this case that you not use the 's' modifier, which lets the dot metacharacter match newlines as well as all other characters.

Finally, you want to make sure you're matching whole words and not just fragments of longer words, so you need to add word boundaries:

/^(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*$/m
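For anyone wanting to try this outside Perl, here is a rough Python sketch of the same pattern, built from a list of words. The function name `paragraphs_with_all` is hypothetical, and it assumes each paragraph sits on a single line, as the answer does.

```python
import re

def paragraphs_with_all(text, words):
    """Return each line ("paragraph") of text that contains every word."""
    # One lookahead per required word; \b keeps "word1" from matching "word10".
    lookaheads = "".join(rf"(?=.*\b{re.escape(w)}\b)" for w in words)
    # re.MULTILINE plays the role of /m; re.DOTALL (/s) is deliberately not set.
    pattern = re.compile(rf"^{lookaheads}.*$", re.MULTILINE)
    return [m.group(0) for m in pattern.finditer(text)]
```

Usage: `paragraphs_with_all(text, ["red", "green", "blue"])` returns only the lines in which all three appear as whole words, in any order.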
柳絮泡泡 2024-07-18 15:22:28

Look at this example:

We have 2 regexps A and B and we want to match both of them, so in pseudo-code it looks like this:

pattern = "/A AND B/"

It can be written without using the AND operator like this:

pattern = "/NOT (NOT A OR NOT B)/"

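in PCRE, NOT has to be written as a negative lookahead, (?!...); a ^ outside a character class is a start anchor rather than negation, so "/(^(^A|^B))/" does not express this. One form that does:

"/^(?!(?!.*A)|(?!.*B))/"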

regexp_match(pattern,data)
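The NOT (NOT A OR NOT B) identity can be checked in Python if NOT is spelled as a negative lookahead, `(?!...)`. This is only a sketch: the sub-patterns `foo` and `bar` stand in for A and B, and the helper `agrees` is made up for the comparison.

```python
import re

A, B = "foo", "bar"  # placeholder sub-patterns standing in for A and B

# NOT (NOT A OR NOT B), with each NOT written as a negative lookahead.
de_morgan = re.compile(rf"^(?!(?!.*{A})|(?!.*{B}))")
# The direct form: A AND B as two positive lookaheads.
direct = re.compile(rf"^(?=.*{A})(?=.*{B})")

def agrees(text):
    """True when both spellings give the same yes/no answer for text."""
    return bool(de_morgan.search(text)) == bool(direct.search(text))
```

Both forms answer the same question; the positive-lookahead version is usually the easier one to read.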
如歌彻婉言 2024-07-18 15:22:28

The AND operator is implicit in the RegExp syntax.
The OR operator must instead be specified with a pipe.
The following RegExp:

var re = /ab/;

means the letter a AND then the letter b, in that order.
It also works with groups:

var re = /(co)(de)/;

It means the group co AND then the group de, in that order.
Replacing the (implicit) AND with an OR would require the following lines:

var re = /a|b/;
var re = /(co)|(de)/;
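A quick Python check of what this implicit AND actually does. It is concatenation, so order matters, which is why it does not cover the "in no particular order" part of the question. The variable names are just for illustration.

```python
import re

sequence = re.compile(r"ab")  # "a" immediately followed by "b"
either = re.compile(r"a|b")   # "a" or "b"

found_in_order = sequence.search("cab") is not None    # "ab" appears in order
found_reversed = sequence.search("ba") is not None     # same letters, wrong order
found_either = either.search("ba") is not None         # either letter is enough
```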
风柔一江水 2024-07-18 15:22:28

You can do that with a regular expression, but you'll probably want to do something else. For example, use several regexps and combine them in an if clause.

You can enumerate all possible permutations with a standard regexp, like this (matches a, b and c in any order):

(abc)|(bca)|(acb)|(bac)|(cab)|(cba)

However, this makes for a very long and probably inefficient regexp if you have more than a couple of terms.

If you are using an extended regexp flavor, like Perl's or Java's, there are better ways to do this. Other answers have suggested using positive lookahead.
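To see how quickly the permutation approach grows, here is a small Python sketch that generates the alternation. `permutation_pattern` is a made-up helper; n terms produce n! alternatives (6 for three terms, 24 for four).

```python
import re
from itertools import permutations

def permutation_pattern(terms):
    """Build one alternation listing the terms in every possible order."""
    alternatives = (
        "".join(re.escape(t) for t in order) for order in permutations(terms)
    )
    return re.compile("|".join(f"({alt})" for alt in alternatives))
```

Note that, like the hand-written version, this only matches the terms when they are directly adjacent.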

凝望流年 2024-07-18 15:22:28

Is it not possible in your case to do the AND on several matching results? In pseudocode:

regexp_match(pattern1, data) && regexp_match(pattern2, data) && ...
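In Python, the pseudocode above might look like the following sketch. `matches_all` is a hypothetical helper; `re.search` is the standard library call.

```python
import re

def matches_all(patterns, data):
    """True only if every pattern is found somewhere in data."""
    return all(re.search(p, data) for p in patterns)
```

For example, `matches_all([r"\bword1\b", r"\bword2\b"], paragraph)` is true only when both words occur, in either order.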
情何以堪。 2024-07-18 15:22:28

Why not use awk?
With awk, combining regexes with AND and OR is simple:

awk '/WORD1/ && /WORD2/ && /WORD3/' myfile
泪是无色的血 2024-07-18 15:22:28

The order is always implied in the structure of the regular expression. Unless your engine supports lookahead, you'll have to match the input string multiple times against different expressions to accomplish what you want.

Without lookahead, what you want to do is not practical with a single regexp.

枯叶蝶 2024-07-18 15:22:28

If you use Perl regular expressions, you can use positive lookahead:

For example

(?=[1-9][0-9]{2})[0-9]*[05]\b

would be numbers of 100 or more (at least three digits, no leading zero) that are divisible by 5
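A short Python sketch to sanity-check that pattern against plain arithmetic. The helper `check` is made up for the comparison, and it assumes the number is written without leading zeros, which is what `str()` produces.

```python
import re

pattern = re.compile(r"(?=[1-9][0-9]{2})[0-9]*[05]\b")

def check(n):
    """True if the decimal form of n matches the pattern in full."""
    return pattern.fullmatch(str(n)) is not None
```

The lookahead enforces "at least three digits, first digit non-zero", while the consuming part enforces "ends in 0 or 5"; both conditions apply to the same text.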

纵性 2024-07-18 15:22:28

In addition to the accepted answer, here are some practical examples that may make things clearer. For example, let's say we have these three lines of text:

[12/Oct/2015:00:37:29 +0200] // only this + will get selected
[12/Oct/2015:00:37:x9 +0200]
[12/Oct/2015:00:37:29 +020x]

What we want to do here is select the + sign, but only if it comes after two digits and a space and before four digits. Those are the only constraints. We would use this regular expression to achieve it:

'~(?<=\d{2} )\+(?=\d{4})~g'

Note that if you split the expression into separate patterns, you will get different results.

Or perhaps you want to select some text between tags... but not the tags! Then you could use:

'~(?<=<p>).*?(?=<\/p>)~g'

for this text:

<p>Hello !</p> <p>I wont select tags! Only text with in</p> 
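Both patterns can be tried in Python roughly as follows. This is a sketch: the `~...~g` delimiters and flag are dropped, since Python has no pattern delimiters and `findall` already returns every match. Python's lookbehind must be fixed-width, which both of these are.

```python
import re

# "+" only when preceded by two digits and a space, and followed by four digits.
plus_sign = re.compile(r"(?<=\d{2} )\+(?=\d{4})")
# Text between <p> and </p>, without the tags themselves.
inside_p = re.compile(r"(?<=<p>).*?(?=</p>)")

log_lines = [
    "[12/Oct/2015:00:37:29 +0200]",
    "[12/Oct/2015:00:37:x9 +0200]",
    "[12/Oct/2015:00:37:29 +020x]",
]
selected = [line for line in log_lines if plus_sign.search(line)]
texts = inside_p.findall("<p>Hello !</p> <p>I wont select tags! Only text with in</p>")
```

Only the first log line ends up in `selected`, and `texts` holds the two tag-free strings.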

拍不死你 2024-07-18 15:22:28

You could pipe your output to another regex. Using grep, you could do this:

grep A | grep B

神仙妹妹 2024-07-18 15:22:28

((yes).*(no))|((no).*(yes))

This will match a sentence containing both yes and no, regardless of the order in which they appear:

Do i like cookies? **Yes**, i do. But milk - **no**, definitely no.

**No**, you may not have my phone. **Yes**, you may go f yourself.

Both will match, provided case-insensitive matching is enabled.
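A small Python sketch of that pattern. The sample sentences here are my own, and `re.IGNORECASE` is what supplies the "ignoring case" part; without it, a capitalised "Yes" or "No" is not matched.

```python
import re

PATTERN = r"((yes).*(no))|((no).*(yes))"

both = re.compile(PATTERN, re.IGNORECASE)  # case-insensitive, as the answer assumes
strict = re.compile(PATTERN)               # same pattern, case-sensitive

yes_then_no = "Yes, I do. But milk - no, definitely no."
no_then_yes = "No, not today. Yes, tomorrow."
```

Keep in mind that this matches substrings, so "know" counts as containing "no" unless word boundaries are added.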

人生百味 2024-07-18 15:22:28

Use AND outside the regular expression. In PHP, the lookahead operator did not seem to work for me, so instead I used this:

if( preg_match("/^.{3,}$/",$pass1) && !preg_match("/\s{1}/",$pass1))
    return true;
else
    return false;

The above check passes if the password is 3 characters or longer and contains no whitespace.
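For comparison, here is a rough Python sketch of the same check written both ways: two regexes joined with `and`, and a single regex using a negative lookahead. The function names are hypothetical, and `re.fullmatch` is slightly stricter than PHP's `^...$`, which tolerates a trailing newline.

```python
import re

def valid_two_checks(password):
    """Length >= 3 and no whitespace, combined outside the regex."""
    long_enough = re.fullmatch(r".{3,}", password) is not None
    has_space = re.search(r"\s", password) is not None
    return long_enough and not has_space

def valid_one_regex(password):
    """The same rule as one pattern: (?!.*\\s) rejects any whitespace ahead."""
    return re.fullmatch(r"(?!.*\s).{3,}", password) is not None
```

The two-check version is arguably easier to read and to extend; the single-regex version is handy when only one pattern can be supplied.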

只是偏爱你 2024-07-18 15:22:28

Here is a possible "form" for an "and" operator:

Take the following regex for an example:

If we want to match words without the "e" character, we could do this:

/\b[^\We]+\b/g
  • \W means NOT a "word" character.
  • [^\W] means a "word" character (the ^ negates only inside the brackets).
  • [^\We] means a "word" character, but not an "e".

"and" Operator for Regular Expressions

I think this pattern can be used as an "and" operator for regular expressions, at least within a single character class.

In general, if:

  • A = not a
  • B = not b

then:

[^AB] = not(A or B) 
      = not(A) and not(B) 
      = a and b

Difference Set

So, if we want to implement the concept of difference set in regular expressions, we could do this:

a - b = a and not(b)
      = a and B
      = [^Ab]
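The words-without-"e" pattern from this answer can be tried in Python like this. It is just a sketch: the sample sentence is mine, and the class works one character at a time, so this kind of "and" applies to character classes rather than to whole phrases.

```python
import re

# [^\We] = "a word character AND not e", applied to each character of a word.
no_e_words = re.compile(r"\b[^\We]+\b")

words = no_e_words.findall("the quick brown fox jumps over a lazy dog")
```

Here "the" and "over" are skipped because they contain an "e"; an uppercase "E" would still be allowed unless it is added to the class.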
等你爱我 2024-07-18 15:22:28

Common situation:

In JavaScript you can do this:

If you want to check whether a password contains both lowercase and uppercase letters, use this:

passwordValue.search(/[a-z]/) !== -1 && passwordValue.search(/[A-Z]/) !== -1

This expression evaluates to true if the password input contains both lowercase and uppercase letters; otherwise it evaluates to false.
