JavaScript spam word filter
Hi,
I am trying to use JavaScript to write a simple spam word filter that loops through an array of words and tries to match each whole word against a string that is passed in.
Below is what I have so far. It works, except that it does partial word matching instead of matching the whole word.
So in my example below, the string passed in:
We are Offering Great Education Classes and Many CE Credits all Year Long!
matched the word "credit".
I am looking for a way to match the whole word rather than a partial word.
Any help would be appreciated.
var spam_words_arr=new Array(
"loan",
"winning",
"bulk email",
"mortgage",
"free",
"save",
"credit",
"amazing",
"bulk",
"email",
"opportunity",
"please read",
"reverses aging",
"hidden assets",
"stop snoring",
"free investment",
"dig up dirt on friends",
"stock disclaimer statement",
"multi level marketing",
"compare rates",
"cable converter",
"claims you can be removed from the list",
"removes wrinkles",
"compete for your business",
"free installation",
"free grant money",
"auto email removal",
"collect child support",
"free leads",
"amazing stuff",
"tells you it's an ad",
"cash bonus",
"promise you",
"claims to be in accordance with some spam law",
"search engine listings",
"free preview",
"act now! don't hesitate",
"credit bureaus",
"no investment",
"obligation",
"guarantee",
"refinance",
"price",
"affordable",
"home loan",
"lower your monthly payments",
"new low rate",
"Your Mortgage",
"Your refi",
"serious cash");
function SubChecker() {
    var sSubject = document.form1.subject.value;
    reset_alert_count();
    var alert_title = "The following words and phrases are not recommended in subject lines";
    var compare_text;
    eval('compare_text=sSubject;');
    for(var j=0; j<spam_words_arr.length; j++) {
        for(var k=0; k<(compare_text.length); k++) {
            if(spam_words_arr[j]==compare_text.substring(k,(k+spam_words_arr[j].length)).toLowerCase()) {
                spam_alert_arr[spam_alert_count]=compare_text.substring(k,(k+spam_words_arr[j].length));
                spam_alert_count++;
            }
        }
    }
    for(var k=1; k<=spam_alert_count; k++) {
        alert_text+= "<br> <li> "+ spam_alert_arr[k-1];
        eval('compare_text=document.form1.subject.focus();');
        eval('compare_text=document.form1.subject.select();');
    }
}
OK, here is my revision, but I cannot get the code to run. Can someone take a look and give me a hand with some suggestions?
Thanks in advance.
function SubChecker() {
    var sSubject = document.form1.subject.value;
    reset_alert_count();
    var alert_title = "The following words and phrases are not recommended in subject lines";
    for(var j=0; j<spam_words_arr.length; j++) {
        for(var k=0; k<(sSubject.length); k++) {
            var rExp = new RegExp("("+spam_words_arr[j]+")", "ig");
            alert(rExp);
            if(rExp.match(sSubject)){
                spam_alert_count++;
            }
        }
        for(var k=1; k<=spam_alert_count; k++) {
            alert_text+= "<br> <li> "+ spam_alert_arr[k-1];
        }
You could make your array of "words" an array of regular expressions and use the \b word boundary marker. E.g.: ... then use the exec or test functions on the regular expression to do the test.
In fact, your array could become one massive alternation with \b on either end. (I've obviously left most of the array out.) In a JavaScript regular expression, an alternation like a|b means "match a or b".
Another advantage to using a regular expression for this is that you can be more flexible than a brute-force list of all suspect words.
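The two whole-word approaches described above might look roughly like the sketch below. The names escapeRegExp, findSpamWords, buildSpamRegExp and sampleWords are hypothetical and not part of the original post. Escaping the entries and sorting longer phrases first are added assumptions, not something the answer specifies. Note that test and exec are RegExp methods, while match belongs to String, which is one reason rExp.match(sSubject) in the revision fails.

```javascript
// Escape characters that have special meaning in a regular expression,
// so an entry containing something like "?" or "(" is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Approach 1: one regular expression per entry, each wrapped in \b ... \b.
function findSpamWords(subject, words) {
    var found = [];
    for (var i = 0; i < words.length; i++) {
        var re = new RegExp("\\b" + escapeRegExp(words[i]) + "\\b", "i");
        // test() is a RegExp method; match() is a String method.
        if (re.test(subject)) {
            found.push(words[i]);
        }
    }
    return found;
}

// Approach 2: a single alternation with \b on either end.
// Longer phrases go first so "bulk email" is tried before "bulk".
function buildSpamRegExp(words) {
    var sorted = words.slice().sort(function (a, b) {
        return b.length - a.length;
    });
    var escaped = [];
    for (var i = 0; i < sorted.length; i++) {
        escaped.push(escapeRegExp(sorted[i]));
    }
    return new RegExp("\\b(?:" + escaped.join("|") + ")\\b", "ig");
}

// A shortened, illustrative word list (not the full list from the question).
var sampleWords = ["loan", "credit", "bulk email", "bulk", "free"];
var subject = "We are Offering Great Education Classes and Many CE Credits all Year Long!";

// "Credits" is not the whole word "credit", so nothing is reported.
var perWordHits = findSpamWords(subject, sampleWords); // → []

// With the "g" flag, String.prototype.match returns every match.
var phraseHits = "Free BULK EMAIL offer".match(buildSpamRegExp(sampleWords)); // → ["Free", "BULK EMAIL"]
```

The per-entry version makes it easy to report which list entry matched; the single alternation scans the subject once and returns the matched text as it was typed.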
Off-topic:
For initializing an array, I'd recommend array literal notation rather than the constructor call you've used, e.g.:
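A short sketch of the literal form, using only the first few entries from the question's list; the variable names spamWords, literalFive and constructedFive are illustrative and not from the original post:

```javascript
// Array literal notation: the brackets list the entries directly.
var spamWords = [
    "loan",
    "winning",
    "bulk email"
    // ...remaining entries from the question's list
];

// The constructor treats a single numeric argument as a length, not an entry.
var literalFive = [5];              // one entry, the number 5
var constructedFive = new Array(5); // length 5, no entries set
```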
It's shorter, it can't run afoul of someone redefining Array, and you don't have the ambiguity of what var x = new Array(5); should mean (that creates an array with five blank spots, rather than an array with one entry containing 5).
Those uses of eval are... odd, as they seem completely unnecessary. There are very, very few use cases where eval is necessary (I've managed to do several years of JavaScript coding without ever using it in production code). If you find yourself writing eval, I recommend posting a question here on StackOverflow with just the bit of code you think you need it for, and why, and the folks here will give you a better alternative.
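As a rough illustration of why the eval calls in the question are unnecessary, each one can be written as an ordinary statement. In the sketch below, subjectField is a hypothetical stand-in for document.form1.subject so that the code runs outside a browser; in the page itself the real form field would be used instead.

```javascript
// Hypothetical stand-in for document.form1.subject (an assumption for this sketch).
var subjectField = {
    value: "Many CE Credits all Year Long!",
    focused: false,
    selected: false,
    focus: function () { this.focused = true; },
    select: function () { this.selected = true; }
};

// Instead of eval('compare_text=sSubject;'):
var sSubject = subjectField.value;
var compare_text = sSubject;

// Instead of eval('compare_text=document.form1.subject.focus();')
// and eval('compare_text=document.form1.subject.select();'):
subjectField.focus();
subjectField.select();
```

Calling focus and select directly also avoids overwriting compare_text with their return values, which the eval versions in the question do.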