PHP inverse preg_match

Posted on 2024-11-02 05:36:26

if(preg_match("/" . $filter . "/i", $node)) {
    echo $node;
}

此代码过滤变量以决定是否显示它。 $filter 的示例条目为“office”或“164(.*)976”。

我想知道是否有一个简单的方法来表达:如果$filter在$node中不匹配。以正则表达式的形式?

那么...不是“if(!preg_match”),而是更多的 $filter =“!office”或“!164(.*)976”,但一个有效的?

if(preg_match("/" . $filter . "/i", $node)) {
    echo $node;
}

This code filters a variable to decide whether to display it. An example value for $filter would be "office" or "164(.*)976".

Is there a simple way to say "if $filter does not match in $node", expressed inside the regular expression itself?

So, not an "if(!preg_match(...))", but rather something like $filter = "!office" or "!164(.*)976", except in a form that actually works?

Comments (3)

酷炫老祖宗 2024-11-09 05:36:26

If you definitely want to use a "negative regex" instead of simply inverting the result of the positive regex, you can do this:

if(preg_match("/^(?:(?!" . $filter . ").)*$/i", $node)) {
    echo $node;
}

This matches a string only if it contains no match for the regex/substring in $filter.

Explanation (taking office as the example string):

^          # Anchor the match at the start of the string
(?:        # Try to match the following:
 (?!       # (unless it's possible to match
  office   # the text "office" at this point)
 )         # (end of negative lookahead),
 .         # Any character
)*         # zero or more times
$          # until the end of the string
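
As a rough sketch of how the pattern above behaves on a few inputs, here it is wrapped in a helper. The function name lacksMatch and the sample strings are made up for illustration. Two assumptions: $filter contains no unescaped "/" (the delimiter used here), and the s modifier has been added so that "." also consumes newlines in multi-line input.

```php
<?php
// Hypothetical helper (not from the original answer): returns true when
// the regex fragment $filter matches nowhere in $node.
// Assumption: $filter contains no unescaped "/" delimiter.
function lacksMatch(string $filter, string $node): bool
{
    // At every position, refuse to advance if $filter could match there.
    // "i" keeps the original case-insensitivity; "s" lets "." match newlines.
    $pattern = '/^(?:(?!' . $filter . ').)*$/is';
    return preg_match($pattern, $node) === 1;
}

$nodes = ['Head Office', 'warehouse', '164-555-976'];

foreach ($nodes as $node) {
    if (lacksMatch('office', $node)) {
        echo $node, "\n"; // printed only for nodes without "office"
    }
}
```

For plain yes/no filtering this gives the same answer as !preg_match(); the lookahead form is mainly useful when the negation has to live inside the pattern string itself, as the question asks.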

小嗷兮 2024-11-09 05:36:26

The (?!...) negative assertion is what you're looking for.

To exclude a certain string from appearing anywhere in the subject you can use this double assertion method:

preg_match('/(?=^((?!not_this).)+$)  (......)/xs', $string);

This still lets you specify an arbitrary (......) main regex. But you can simply leave that part out if you only want to forbid a string.
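
A small sketch of the double assertion with a concrete main regex. The named group num, the helper name, and the sample strings are all invented for illustration. One thing worth noticing: because the lookahead contains ^, the main regex starts matching at the beginning of the subject, so this sketch puts a lazy .*? in front of the part it actually wants to capture.

```php
<?php
// Forbid "not_this" anywhere in the subject, then capture the first run of digits.
// The x modifier makes the spaces in the pattern insignificant; s lets "." match newlines.
$pattern = '/(?=^((?!not_this).)+$) .*? (?<num>\d+)/xs';

// Hypothetical helper: returns the captured digits, or null when the subject
// contains the forbidden text or has no digits at all.
function firstNumberUnlessForbidden(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m['num'] : null;
}

var_dump(firstNumberUnlessForbidden($pattern, 'order 42 shipped'));  // string(2) "42"
var_dump(firstNumberUnlessForbidden($pattern, 'order 42 not_this')); // NULL
```

A named group is used here because the parentheses inside the lookahead also count as a capturing group, which shifts the numeric indexes.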

铁轨上的流浪者 2024-11-09 05:36:26

Answer number 2 by mario is the correct answer, and here is why:

First to answer the comment by Justin Morgan,

I'm curious, do you have any idea what the performance of this would
be as opposed to the !preg_match() approach? I'm not in a place where
I can test them both. – Justin Morgan Apr 19 '11 at 21:53

Consider the gate logic for a moment.

When to negate preg_match(): when you are looking for a match and want the condition to be 1) true when the regex is absent, or 2) false when the regex is present.

When to use a negative assertion in the regex: when you want the condition to be true only if the string consists ONLY of what the regex allows, and to fail if anything else is found. This is necessary if you really need to test for undesirable characters while allowing permitted characters to be omitted.

Negating the result of (preg_match() === 1) only tests whether the regex is present. If 'bar' is required and numbers aren't allowed, neither of the following checks is sufficient on its own:

if (preg_match('/bar/', 'foo2bar') === 1) {
  echo "found 'bar'"; // but a number is here, so this should fail.
}

if (preg_match('/[0-9]/', 'foobar') === 0) {
  echo "no numbers found"; // but this didn't test for 'bar', so it proves nothing about 'bar'.
}

So, in order to really test multiple regexes, a beginner would test using multiple preg_match() calls... we know this is a very amateur way to do it.
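
For the 'bar' and digits case, one way to fold both conditions into a single pattern is a negative lookahead anchored at the start. This is only a sketch; the helper name hasBarAndNoDigits is invented for illustration.

```php
<?php
// Hypothetical helper: requires "bar" somewhere and forbids any ASCII digit,
// using one preg_match() call.
function hasBarAndNoDigits(string $subject): bool
{
    // (?!.*[0-9]) at the start rejects the subject if a digit appears anywhere;
    // .*bar then requires "bar" somewhere. "s" lets "." cross newlines.
    return preg_match('/^(?!.*[0-9]).*bar/s', $subject) === 1;
}

var_dump(hasBarAndNoDigits('foobar'));  // bool(true)
var_dump(hasBarAndNoDigits('foo2bar')); // bool(false): contains a digit
var_dump(hasBarAndNoDigits('foo'));     // bool(false): no "bar"
```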

So, the OP wants to test a string against a regex and have the condition pass only when the string does not match it. For most simple cases, simply negating preg_match() will suffice, but for more complex or extensive regex patterns, it won't. I will use my own situation as a more real-life scenario:

Say you want a user form for a person's name, particularly a last name. You want your system to accept all letters regardless of case and placement, accept hyphens, accept apostrophes, and exclude all other characters. Writing a regex that matches all undesired characters is the first thing we think of, but imagine you are supporting UTF-8... that's a lot of characters! Your program will be nearly as big as the UTF-8 table, just on a single line! I don't care what hardware you have, your server application has a finite limit on how long a command can be, not to mention the limit of 200 parenthesized subpatterns, so the ENTIRE UTF-8 character table (minus [A-Z], [a-z], -, and ') is too long, never mind that the program itself will be HUGE!

Since we won't write an if (!preg_match('.#\\$\%... (this can get quite long and impractical to evaluate) on a string to see whether the string is bad, we should instead test the easier way: use a negative lookahead assertion on the permitted characters, then negate the overall result:

<?php
  $string = "O'Reilly-Finlay";
  // (?![a-z'-]). matches any single character that is NOT a letter, apostrophe, or hyphen
  // (same effect as the negated class [^a-z'-]); the s flag lets . cover newlines too.
  if (preg_match('/(?![a-z\'-])./is', $string) === 0) {
    echo "the given string matched exclusively for regex pattern";
    // not reached on error: preg_match then returns false, which is not identical to 0
  } else {
    echo "the given string did not match exclusively to the regex pattern";
  }
?>

If we only looked for the regex /[a-z\'-]/i, all we would be saying is "match the string if it contains ANY of those characters", so bad characters aren't tested. If we negated that at the function, we would be saying "fail if the string contains any of these characters". That isn't right either, so we need to say "fail if we match ANYTHING not in the character class", which is done with the lookahead. I know the bells are going off in someone's head, and they are thinking wildcard-expansion style... no, lookahead doesn't do that; it consumes nothing and only asserts, at each position, that the permitted class does not match there. So the engine checks the first character: if it is a permitted character, the lookahead fails at that position and the engine moves on, until it either finds a character outside the class or reaches the end. If it finds one, preg_match() stops there and returns 1 (with that character in the match array, if one was passed); if it reaches the end without finding one, it returns 0. In short, a negative assertion on regex 'a' is equivalent to matching regex 'b', where 'b' contains EVERYTHING ELSE not matchable by 'a'. Great for when 'b' would be ungodly extensive.

Note: if my regex has an error in it, I apologize... I have been using Lua for the last few months, so I may be mixing up my regex rules. Otherwise, (?!...), parentheses included, is the proper negative lookahead syntax for PHP.
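
For the UTF-8 case mentioned earlier, a hedged sketch of the whitelist idea using PCRE's Unicode letter property \p{L} together with the u modifier. The function name isValidLastName and the sample names are assumptions for illustration, not part of the original answer. It assumes the input is valid UTF-8; on invalid UTF-8, preg_match() returns false and the check fails.

```php
<?php
// Hypothetical validator: accepts only Unicode letters, apostrophes and hyphens.
function isValidLastName(string $name): bool
{
    // \A and \z anchor to the true start and end of the string
    // (unlike $, \z does not tolerate a trailing newline).
    // \p{L} matches any Unicode letter; the u modifier treats pattern and subject as UTF-8.
    return preg_match('/\A[\p{L}\'-]+\z/u', $name) === 1;
}

var_dump(isValidLastName("O'Reilly-Finlay")); // bool(true)
var_dump(isValidLastName('Müller'));          // bool(true)
var_dump(isValidLastName('R2D2'));            // bool(false): digits are not allowed
```

Listing what is permitted keeps the pattern short no matter how large the set of forbidden characters is, which is the point being made above.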
