Detecting specific words in a textarea submission

Posted 2024-12-10 09:58:00 · 357 characters · 1 view · 0 comments

I have a new feature on my site, where users can submit any text (I stopped all HTML entries) via a textarea. The main problem I still have though is that they could type "http://somewhere.com" which is something I want to stop. I also want to blacklist specific words. This is what I had before:

if (strpos($entry, "http://" or ".com" or ".net" or "www." or ".org" or ".co.uk" or "https://") !== true) {
            die ('Entries cannot contain links!');

However that didn't work, as it stopped users from submitting any text at all. So my question is simple, how can I do it?

Comments (3)

巷雨优美回忆 2024-12-17 09:58:00

This is a job for regular expressions.

What you need to do is something like this:

// A list of words you don't allow
$disallowedWords = array(
  'these',
  'words',
  'are',
  'not',
  'allowed'
);
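// Search for disallowed words.
// \b matches a word boundary, so 'are' matches but 'care' and 'stare' do not,
// and a word at the very start or end of the entry is still caught.
foreach ($disallowedWords as $word) {
  // preg_quote() escapes any regex metacharacters in the word
  if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $entry)) {
    die("The word '$word' is not allowed...");
  }
}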

// This variable should contain a regex that will match URLs
// there are thousands out there, take your pick. I have just
// used an arbitrary one I found with Google
$urlRegex = '(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*';

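// Search for URLs. preg_match() needs pattern delimiters; '!' is used
// here because it does not occur anywhere in $urlRegex.
if (preg_match('!' . $urlRegex . '!', $entry)) {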
  die("URLs are not allowed...");
}
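
As a variation on the loop above, the disallowed words can be folded into a single pattern so that every hit is reported at once. This is only a sketch: the function name findDisallowedWords is made up for illustration, and it assumes the list holds plain ASCII words, since \b is defined in terms of word characters.

```php
<?php
// Hypothetical helper: returns every disallowed word found in $entry,
// lower-cased and de-duplicated, instead of stopping at the first hit.
function findDisallowedWords(string $entry, array $disallowedWords): array
{
    if ($disallowedWords === array()) {
        return array();
    }
    // preg_quote() escapes regex metacharacters inside each word.
    $quoted = array();
    foreach ($disallowedWords as $word) {
        $quoted[] = preg_quote($word, '/');
    }
    // \b anchors on word boundaries, so 'are' matches but 'care' does not.
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    preg_match_all($pattern, $entry, $matches);
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

$words = array('these', 'words', 'are', 'not', 'allowed');
print_r(findDisallowedWords('Are you sure? I do not care.', $words));
```

For the sample sentence the function reports 'are' and 'not', while 'care' is left alone. Returning the list rather than calling die() lets the caller decide how to present the error.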

分开我的手 2024-12-17 09:58:00

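You must call strpos more than once. In your version, PHP evaluates the chain of or operators first, which yields a single boolean (true), and passes that boolean to strpos instead of a search string. On top of that, strpos returns either an integer position or false, never true, so the !== true comparison always succeeds and every submission is rejected.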

This way it should work:

if (strpos($entry, "http://") !== false || strpos($entry, "https://") !== false || strpos($entry, ".com") !== false)
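
To make the behaviour concrete, here is a small sketch that can be run from the command line. The sample value of $entry and the variable names $needle and $hasLink are assumptions chosen for illustration; only strpos and the operators come from the question.

```php
<?php
$entry = 'see http://somewhere.com';

// The "or" chain is evaluated on its own and collapses to one boolean.
$needle = ("http://" or ".com" or "www.");
var_dump($needle); // bool(true)

// strpos() returns the integer offset of the match, or false if there is none.
var_dump(strpos($entry, 'http://')); // int(4)
var_dump(strpos($entry, 'ftp://'));  // bool(false)

// A match at offset 0 is why the strict !== false comparison matters:
// 0 == false is true in PHP, but 0 !== false is also true.
var_dump(strpos('http://x', 'http://') !== false); // bool(true)

// The corrected check: one strpos() call per fragment.
$hasLink = strpos($entry, 'http://') !== false
    || strpos($entry, 'https://') !== false
    || strpos($entry, '.com') !== false;
var_dump($hasLink); // bool(true)
```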

乱世争霸 2024-12-17 09:58:00

A simple way to do this is to put all the words not allowed into an array and loop through them to check each one.

$banned = array('http://', '.com', '.net', 'www.', '.org'); // Add more
foreach ($banned as $word):
    if (strpos($entry, $word) !== false) die('Contains banned word');
endforeach;

The problem with this is that if you get carried away and start banning something like 'com' on its own, plenty of perfectly legitimate words and phrases contain the letters 'com' and will trigger false positives. You could use regular expressions to search for strings that look like URLs, but people can easily get around those by breaking the URL up. There is no effective way to completely stop people from posting links in a comment. If you don't want them there, you'll ultimately have to rely on moderation. Community moderation works very well; look at Stack Overflow, for instance.
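
Both failure modes are easy to reproduce. The sketch below wraps the same substring check in a function (the name hitsBannedFragment and the sample sentences are hypothetical) and feeds it one harmless sentence and one disguised link.

```php
<?php
// Same strpos() check as the loop above, returning a boolean
// instead of calling die().
function hitsBannedFragment(string $entry, array $banned): bool
{
    foreach ($banned as $fragment) {
        if (strpos($entry, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

// False positive: banning the bare letters 'com' rejects ordinary words.
var_dump(hitsBannedFragment('You are welcome to comment', array('com'))); // bool(true)

// Evasion: a broken-up link contains none of the banned fragments.
$banned = array('http://', '.com', '.net', 'www.', '.org');
var_dump(hitsBannedFragment('go to somewhere dot com', $banned)); // bool(false)
```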
