JavaScript spam word filter

Posted on 2024-11-03 12:16:32 · 3029 words · 2 views · 0 comments

Hi,

I am trying to use JavaScript to write a simple spam word filter that loops through an array of words and tries to match each one, as a whole word, against a string that is passed in.

Below is what I have so far. It works, except that it does partial word matching instead of matching the whole word.

So in my example, the string passed in was:

We are Offering Great Education Classes and Many CE Credits all Year Long!

It matched the word "credit".

I am looking for a way to match the whole word and not a partial word match.

Any help would be appreciated.

    var spam_words_arr=new Array(
"loan",
"winning",
"bulk email",
"mortgage",
"free",
"save",
"credit",
"amazing",
"bulk",
"email",
"opportunity",
"please read",
"reverses aging",
"hidden assets",
"stop snoring",
"free investment",
"dig up dirt on friends",
"stock disclaimer statement",
"multi level marketing",
"compare rates",
"cable converter",
"claims you can be removed from the list",
"removes wrinkles",
"compete for your business",
"free installation",
"free grant money",
"auto email removal",
"collect child support",
"free leads",
"amazing stuff",
"tells you it's an ad",
"cash bonus",
"promise you",
"claims to be in accordance with some spam law",
"search engine listings",
"free preview",
"act now! don't hesitate",
"credit bureaus",
"no investment",
"obligation",
"guarantee",
"refinance",
"price",
"affordable",
"home loan",
"lower your monthly payments",
"new low rate",
"Your Mortgage",
"Your refi",
"serious cash"); 



 function SubChecker() { 
    var sSubject = document.form1.subject.value;
    reset_alert_count();
    var alert_title = "The following words and phrases are not recommended in subject lines";
    var compare_text; 

        eval('compare_text=sSubject;'); 
            for(var j=0; j<spam_words_arr.length; j++) { 
                for(var k=0; k<(compare_text.length); k++) { 
                    if(spam_words_arr[j]==compare_text.substring(k,(k+spam_words_arr[j].length)).toLowerCase()) {
                        spam_alert_arr[spam_alert_count]=compare_text.substring(k,(k+spam_words_arr[j].length)); 
                        spam_alert_count++; 
                    } 
                } 
        } 
        for(var k=1; k<=spam_alert_count; k++) { 
            alert_text+= "<br> <li> "+ spam_alert_arr[k-1]; 
            eval('compare_text=document.form1.subject.focus();'); 
            eval('compare_text=document.form1.subject.select();'); 
        } 

    } 

OK, here is my revision, but I cannot get the code to run. Can someone take a look and give me a hand with some suggestions?

Thanks in advance.

function SubChecker() { 
var sSubject = document.form1.subject.value;
reset_alert_count();
var alert_title = "The following words and phrases are not recommended in subject lines";


    for(var j=0; j<spam_words_arr.length; j++) {
            for(var k=0; k<(sSubject.length); k++) {
                var rExp = new RegExp("("+spam_words_arr[j]+")", "ig");
                alert(rExp);
                if(rExp.match(sSubject)){
                    spam_alert_count++;
                }
    }
    for(var k=1; k<=spam_alert_count; k++) {
        alert_text+= "<br> <li> "+ spam_alert_arr[k-1];

    }




Comments (1)

灯角 2024-11-10 12:16:32

You could make your array of "words" an array of regular expressions, and use the \b word boundary marker. E.g.:

var spam_words_arr=new Array(
    /\bloan\b/i,
    ...
);

...then use the exec or test functions on the regular expression to do the test.
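As a minimal sketch of that suggestion (the three-entry list and the function name `containsSpamWord` are made up for illustration, not taken from the question), each pattern carries its own \b anchors and is checked with `test`. Note that `test` and `exec` are the methods on a regular expression; `match` belongs to strings, which is probably why `rExp.match(sSubject)` in the revised code fails.

```javascript
// Hypothetical shortened list; the real array would hold every spam word.
const spamPatterns = [/\bloan\b/i, /\bcredit\b/i, /\bfree\b/i];

// Returns true if any pattern matches the subject as a whole word.
// The patterns have no "g" flag, so test() keeps no state between calls.
function containsSpamWord(subject) {
  return spamPatterns.some((pattern) => pattern.test(subject));
}

// "Credits" is not a whole-word match for "credit", so this is false.
console.log(containsSpamWord("We are Offering Great Education Classes and Many CE Credits all Year Long!"));

// "Free" and "credit" both appear as whole words here, so this is true.
console.log(containsSpamWord("Free credit check"));
```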

In fact, your array could become one massive alternation with \b on either end:

var regex = /\b(?:loan|winning|bulk email|mortgage|free)\b/i;

(I've obviously left most of the array out.) In a JavaScript regular expression, an alternation like a|b means "match a or b."
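If you would rather keep the plain string array and build the alternation from it, something along these lines could work. This is only a sketch: `escapeRegExp`, `buildSpamRegex` and `findSpamWords` are hypothetical helper names, the word list is shortened, and sorting longer phrases first is my own assumption so that "bulk email" is reported in preference to "bulk".

```javascript
// Hypothetical shortened list of plain strings, as in the question.
const spamWords = ["loan", "bulk email", "credit", "bulk", "email"];

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one alternation with \b on either end; longer phrases go first.
function buildSpamRegex(words) {
  const alternatives = words
    .slice()
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  return new RegExp("\\b(?:" + alternatives + ")\\b", "gi");
}

// Returns every whole-word match found in the subject (empty array if none).
function findSpamWords(subject, words) {
  return subject.match(buildSpamRegex(words)) || [];
}

console.log(findSpamWords("Bulk email about credits and a loan", spamWords));
// → [ 'Bulk email', 'loan' ]
```

With the "g" flag, `String.prototype.match` returns all matches at once, which gives you the list of offending words for the alert text without the inner character-by-character loop.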

Another advantage to using a regular expression for this is that you can be more flexible than a brute-force list of all suspect words.
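For instance, a single pattern can cover several spellings of a word instead of listing each one. The variants below are my own illustrative picks, not a vetted spam list:

```javascript
// One pattern covering optional plurals and "refi" / "refinance".
const flexiblePattern = /\b(?:free|loans?|mortgages?|refi(?:nance)?)\b/i;

console.log(flexiblePattern.test("Refinance your mortgages today")); // true
console.log(flexiblePattern.test("Refined taste and freedom"));      // false
```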


Off-topic:

  1. For initializing an array, I'd recommend array literal notation rather than the constructor call you've used, e.g.:

    var spam_words_array = [
        entry,
        entry,
        entry,
        // ...
    ];
    

    It's shorter, it can't run afoul of someone redefining Array, and you don't have the ambiguity of what var x = new Array(5); should mean (that creates an array with five blank spots, rather than an array with one entry containing 5).

  2. Those uses of eval are...odd, as they seem completely unnecessary. There are very, very few use-cases where eval is necessary (I've managed to do several years of JavaScript coding without ever using it in production code). If you find yourself writing eval, I'd recommend posting a question here on StackOverflow with just the bit of code you think you need it for, and why, and the folks here will give you a better alternative.
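Both off-topic points can be seen in a few lines. This is just an illustration: the variable names `viaConstructor` and `viaLiteral` are made up, and the subject string stands in for `document.form1.subject.value` so the snippet runs outside a browser.

```javascript
// Point 1: the constructor call and the literal mean different things for 5.
const viaConstructor = new Array(5); // five empty slots
const viaLiteral = [5];              // one entry containing 5

console.log(viaConstructor.length); // 5
console.log(viaLiteral.length);     // 1
console.log(0 in viaConstructor);   // false (slot 0 was never filled)

// Point 2: eval('compare_text=sSubject;') is just an assignment.
const sSubject = "Many CE Credits all Year Long!"; // stand-in for the form value
const compare_text = sSubject;                     // same effect, no eval

console.log(compare_text === sSubject); // true

// Likewise, in the browser the two eval calls in the second loop reduce to:
//   document.form1.subject.focus();
//   document.form1.subject.select();
```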
