PHP preg_match with a regular expression - word filter

Posted on 2024-11-15 00:58:51

Hello All,

I am trying to use preg_match to identify whether a single word is found within a string of text. The word needs to be picked up even if there are multiple instances of each character within the word (in the correct order). To make life hard for myself, I also want to pick up on the word even if the client has tried to 'fool' the preg_match by inserting certain characters within the word I wish to match.

It is for use in a swearword filter: if 'dave' is found, I will replace it with something else. I have tried to come up with the perfect regular expression, but I'm not having much luck. Please see the following examples and the issues I have found so far (I have used 3 as an example character the client could use to 'fool' the check):


Using: ~\b(?:3+)?d+(?:3+)?a+(?:3+)?v+(?:3+)?e+(?:3+)?\b~i

Okay

  • Input: dave = pass
  • Input: 3d3a3v3e3 = pass
  • Input: ddddaaaavvvveeee = pass
  • Input: 3ave = fail

Not Okay

  • Input: dd3ddaa3aa3vv3vvee3ee = fail (I want this to pass)

Using: ~\b[d3]+[a3]+[v3]+[e3]+\b~i

Okay

  • Input: dave = pass
  • Input: 3d3a3v3e3 = pass
  • Input: ddddaaaavvvveeee = pass
  • Input: dd3ddaa3aa3vv3vvee3ee = pass

Not Okay

  • Input: 3ave = pass (I want this to fail)
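To see why the two patterns behave this way, here is a small diagnostic sketch. It is only an illustration: the capture groups in the second pattern and the variable names are added here for inspection and are not part of the patterns above. The first pattern allows a run of 3s only between two different letters, so a 3 inside a run of d's breaks it. The second pattern lets a 3 stand in for a letter entirely.

```php
<?php
// First pattern from the question, unchanged.
$first = '~\b(?:3+)?d+(?:3+)?a+(?:3+)?v+(?:3+)?e+(?:3+)?\b~i';

// After "dd" and the optional "3", the pattern expects "a" but finds "d".
$firstMatches = preg_match($first, 'dd3ddaa3aa3vv3vvee3ee');
var_dump($firstMatches); // int(0)

// Second pattern, with capture groups added (assumption, for inspection only).
$second = '~\b([d3]+)([a3]+)([v3]+)([e3]+)\b~i';
preg_match($second, '3ave', $slots);

// $slots[1] is "3": the slot meant for "d" was filled by the filler alone.
print_r(array_slice($slots, 1));
```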

Thank you for any help with the regular expression; it's much appreciated.


Comments (2)

挽袖吟 2024-11-22 00:58:51

Without discussing if it's a good profanity filter (probably not!), the following regex will fulfill your spec:

d.*a.*v.*e

If '3' is the only 'special' character, then try this:

d3*a3*v3*e
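
For comparison, here is a hedged sketch of a variant that is not taken from the answer above. It requires each letter once, then allows any mix of that letter and the filler '3', and keeps the question's `\b` anchors and `i` flag. The `$cases` array just restates the five inputs from the question with the results the asker wants. It assumes '3' is the only filler character.

```php
<?php
// Hypothetical variant: one required letter, then any run of that letter or '3'.
$pattern = '~\b3*d[d3]*a[a3]*v[v3]*e[e3]*\b~i';

// Inputs from the question, with the result the asker wants for each.
$cases = [
    'dave'                  => true,
    '3d3a3v3e3'             => true,
    'ddddaaaavvvveeee'      => true,
    'dd3ddaa3aa3vv3vvee3ee' => true,
    '3ave'                  => false,
];

$results = [];
foreach ($cases as $input => $expected) {
    // preg_match returns 1 on a match and 0 otherwise.
    $results[$input] = preg_match($pattern, $input) === 1;
    printf("%-22s %s\n", $input, $results[$input] ? 'pass' : 'fail');
}
```

Because every letter is required at least once, '3ave' no longer slips through, while 3s inside a run of repeated letters are still tolerated.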

欢烬 2024-11-22 00:58:51

This won't work.

For instance, your filter is going to block "firetruck" ;)

Someone could also just substitute a 'u' for a 'v', or a 'c' for a '<'.

I don't know if there is a good way to build a profanity filter, other than to have a large white-list of known words and their misspellings.

Perhaps you should rethink why you want the profanity filter. If your 'customer' wants it, have them supply a list of words they want blocked; it's not your problem.
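
If the customer does supply a word list, one possible way to wire it up is sketched below. Everything here is an assumption for illustration: the helper names `build_word_pattern` and `censor`, the `'****'` replacement, and the idea of a single non-empty set of filler characters. It assumes the blocked words are plain ASCII letters, and it does not handle look-alike substitutions such as 'u' for 'v'.

```php
<?php
// Hypothetical helper: build a pattern for one blocked word.
// $filler is a non-empty set of characters a user might insert to dodge the filter.
function build_word_pattern(string $word, string $filler = '3'): string
{
    $fillerClass = preg_quote($filler, '~');
    $pattern = '\b[' . $fillerClass . ']*';
    foreach (str_split($word) as $char) {
        $c = preg_quote($char, '~');
        // One required letter, then any run of that letter or filler.
        $pattern .= $c . '[' . $c . $fillerClass . ']*';
    }
    return '~' . $pattern . '\b~i';
}

// Hypothetical helper: replace every blocked word found in $text.
function censor(string $text, array $blockedWords, string $replacement = '****'): string
{
    foreach ($blockedWords as $word) {
        // preg_replace returns null on error; keep the text unchanged in that case.
        $text = preg_replace(build_word_pattern($word), $replacement, $text) ?? $text;
    }
    return $text;
}

echo censor('hi dd3ddaave, meet 3ave', ['dave']), "\n"; // hi ****, meet 3ave
```

The word list stays in the customer's hands, and the code only decides how tolerant each pattern is of repeated letters and filler.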
