Regex built with preg_quote does not detect symbols
I have a dictionary of swear words in the database, and the following works great:
preg_match_all("/\b".$f."(?:ing|er|es|s)?\b/si",$t,$m,PREG_SET_ORDER);
$t is the input text and, simply, $f = preg_quote("punk"); "punk" comes from the database dictionary, so at this point in the loop the expression is as follows:
preg_match_all("/\bpunk(?:ing|er|es|s)?\b/si",$t,$m,PREG_SET_ORDER);
preg_quote replaces symbols, e.g. # with \\#, so that the expression is escaped. But when the dictionary word is e.g. "F@CK" or "A$$", it is not detected in the input string with the above expression. I have both a$$ and f@ck in the dictionary, but they do not work. If I remove preg_quote() on the word, the regular expression is invalid because these symbols are not escaped.
Any suggestions on how I can detect "a$$"?
Edit:
So I guess the expression that is not working as intended would be e.g.
preg_match_all("/\bf\@ck(?:ing|er|es|s)?\b/si",$t,$m,PREG_SET_ORDER);
which should find f@ck in $t.
UPDATE:
This is my usage, simply put: if there are matches in $m, replace them with "****". This whole block is inside a loop through each word in the dictionary; $f is the dictionary word and $t is the input.
$f = preg_quote($f);
preg_match_all("/\b$f(?:ing|er|es|s)?\b/si",$t,$m,PREG_SET_ORDER);
if (count($m) > 0) {
    $t = preg_replace("/(\b$f(?:ing|er|es|s)?\b)/si","*****",$t);
}
UPDATE:
Behold, the var_dump:
preg_quote($f) = string(5) "a\$\$"
$t = string(18) "You're such an a$$"
expression = string(29) "/\ba\$\$(?:ing|er|es|s)?\b/si"
UPDATE:
This is only happening when words end with a symbol. I tested "a$$hole" and it's fine, but "a$$" doesn't work.
ANOTHER UPDATE:
Try this simplified version, with $words being a makeshift dictionary:
// Single quotes, so PHP does not try to interpolate "$hole" as a variable.
$words = array('a$$','asshole','a$$hole','f@ck','f#ck','f*ck');
$text = 'Input whatever you feel like here eg. a$$';
foreach ($words as $f) {
    $f = preg_quote($f,"/");
    $text = preg_replace("/\b".$f."(?:ing|er|es|s)?\b/si",
        str_repeat("*",strlen($f)),
        $text);
}
I should expect to see "Input whatever you feel like here eg. ***" as a result.
Comments (3)
Cannot Be Done
I'm sorry, but this “problem” is truly impossible to solve. Consider these:
It Gets Worse
And if you think those are easy, just try coping with all of these:
And that’s not all: there are at least a bazingatillion more where those came from. Do you see now why this fundamentally cannot be done?
Full Disclosure
Because I don't believe in security through obscurity, here's the program that generates all those:
\b checks for a word boundary. According to http://www.regular-expressions.info/wordboundaries.html, "word characters" are letters, digits, and underscores, so in the string "a$$" the word boundary occurs after the "a", not after the second "$".
You will probably need to explicitly specify the characters you consider to be "word boundaries" by using a class (e.g., [- '"]).

Now that you say it doesn't work at the end of the word, I see the problem. $, @, and other such special characters aren't part of the word, so in the case of 'a$$' \b ends the word after the 'a' if no other letters follow in the input string. I suggest using [^a-z] to mark the end of the word to fix it.
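
To make the boundary issue concrete, here is a minimal sketch rather than a definitive fix. It assumes PHP 7 or later. The censor() helper is hypothetical, and instead of the character classes suggested above it swaps in the lookarounds (?<!\w) and (?!\w), which also succeed at the very start or end of the input.

```php
<?php
// Sketch only: censor() and the lookaround boundaries are illustrative
// assumptions, not code taken from the question or the answers.
function censor(string $text, array $words): string
{
    foreach ($words as $word) {
        // Escape regex metacharacters and the "/" delimiter.
        $quoted = preg_quote($word, '/');
        // (?<!\w) / (?!\w): no word character directly before / after the match.
        // Unlike \b, these hold even when the word starts or ends with a symbol.
        $pattern = '/(?<!\w)' . $quoted . '(?:ing|er|es|s)?(?!\w)/i';
        $text = preg_replace_callback(
            $pattern,
            function (array $m): string {
                // Mask with one asterisk per matched character.
                return str_repeat('*', strlen($m[0]));
            },
            $text
        );
    }
    return $text;
}

// Single quotes keep PHP from treating "$hole" as a variable.
$words = ['a$$', 'asshole', 'a$$hole', 'f@ck', 'f#ck', 'f*ck'];

// \b needs a word character on one side, so it fails after the final "$".
$withWordBoundary = preg_match('/\ba\$\$\b/i', 'such an a$$');
// The negative lookahead only requires that no word character follows.
$withLookahead = preg_match('/\ba\$\$(?!\w)/i', 'such an a$$');

echo censor('Input whatever you feel like here eg. a$$', $words), "\n";
```

The mask length is taken from the matched text rather than from the quoted pattern, so the backslashes added by preg_quote do not inflate the number of asterisks.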