The problem is: what do you class as normal words, and what as spam words?
但问题是,您应该如何划分正常单词和垃圾邮件单词呢?
If you encounter a word or phrase that is not in the Spam Words list, you can add it.
如果你遇到了一个词或者短语不在垃圾广告词列表上,你可以添加上去。
If almost every comment anyone submits ends up in your moderation queue, there's probably a problem with your spam words list.
如果任何人递交的评论都出现在你的审核队列中,这可能是你的垃圾广告词列表方面的问题。
In "A Plan for Spam" (see Resources later in this article), Graham suggested building Bayesian probability models of spam and non-spam words.
在“A Plan for Spam”(请参阅本文后面的参考资料)中,Graham提议建立垃圾邮件和非垃圾邮件单词的贝叶斯概率模型。
The principle is very simple, but it's also very effective as a lot of spam contains the same words and sometimes the same repetition of words.
这个原则非常简单,但是也非常有效,因为大量垃圾邮件都包含相同的单词,并且有时会包含单词的相同的重复。
The general idea is that some words occur more frequently in known spam, and other words occur more frequently in legitimate messages.
其大体思想是,在已知的垃圾邮件中,一些单词出现的频率较高,而在合法消息中,另一些单词出现的频率较高。
The Bayesian spam filtering technique works by comparing "spam" words with "normal" words.
Bayesian垃圾邮件过滤技术通过将“垃圾邮件”单词与“正常”单词进行比较来实现过滤任务。
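The idea in the sentences above, that some words are more frequent in known spam and others in legitimate messages, can be sketched in a few lines of Python. This is a minimal illustration using a multinomial naive Bayes model with add-one smoothing, not Graham's exact formula from "A Plan for Spam"; the class name `NaiveBayesSpamFilter`, the `tokenize` helper, and the training messages are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9$']+", text.lower())


class NaiveBayesSpamFilter:
    """Illustrative word-frequency spam model; labels are 'spam' and 'ham'."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message known to be spam or ham."""
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Return the estimated probability that the text is spam."""
        spam, ham = self.word_counts["spam"], self.word_counts["ham"]
        vocab = set(spam) | set(ham)
        spam_total, ham_total = sum(spam.values()), sum(ham.values())
        n_spam, n_ham = self.message_counts["spam"], self.message_counts["ham"]

        # Start from the smoothed prior odds of spam versus ham.
        log_odds = math.log(n_spam + 1) - math.log(n_ham + 1)
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing keeps unseen word/class pairs from giving zero.
            p_word_spam = (spam[word] + 1) / (spam_total + len(vocab))
            p_word_ham = (ham[word] + 1) / (ham_total + len(vocab))
            log_odds += math.log(p_word_spam) - math.log(p_word_ham)

        # Clamp before exponentiating to avoid floating-point overflow.
        log_odds = max(min(log_odds, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(-log_odds))


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("cheap pills online", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("see you at the meeting", "ham")

print(spam_filter.spam_probability("cheap pills"))    # above 0.5
print(spam_filter.spam_probability("meeting notes"))  # below 0.5
```

A real filter would train on far more messages and would usually choose a decision threshold well above 0.5 to limit false positives.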
In other words, a genuine and trusted E-mail address gets identified as a spam E-mail address, which can skew your results.
换句话说,将真实的、受信任的电子邮件地址标识为垃圾电子邮件地址,这可能会影响您的结果。
No spam posts of less than 10 words.
少于10个单词为垃圾帖。
You also want to avoid using the words hello, hi, help, and new, or the recipient's name or e-mail address, as doing so can trigger spam filters.
您还应避免使用以下词语:hello、hi、help、new,或收件人的姓名或电子邮件地址,因为这样做可能会触发垃圾邮件过滤器。
When filtering spam, the traditional Bayesian method treats an email as an unordered vector space of keywords, discarding the relationships between words and between sentences.
传统的贝叶斯方法对邮件进行过滤时,将邮件视为一个无序关键词的向量空间。丢掉了词与词之间,句子之间的相互关系。
When a comment contains any of these words in its content, name, URL, E-mail, or IP, it will be marked as spam.
当评论内容中包含以下任何的词的时候,名称,URL,电子邮件,或者IP,评论就会拖去审核。
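The word-list rule described above can be sketched as a simple check over the comment's fields. This is a hypothetical illustration, not the code of any particular blogging platform: the function name `contains_spam_word`, the dictionary keys, and the case-insensitive substring matching are all assumptions.

```python
# Field names are assumed for illustration; a real system may use different keys.
CHECKED_FIELDS = ("content", "name", "url", "email", "ip")


def contains_spam_word(comment, spam_words):
    """Return True if any listed word appears in any checked field of the comment."""
    # Normalise the list once: trim whitespace, lowercase, drop empty entries.
    words = [w.strip().lower() for w in spam_words if w.strip()]
    for field in CHECKED_FIELDS:
        value = str(comment.get(field, "")).lower()
        # Substring match, so "pills" also matches "cheap-pills.example".
        if any(word in value for word in words):
            return True
    return False


comment = {
    "content": "Great post",
    "name": "Bob",
    "url": "http://cheap-pills.example",
    "email": "bob@example.com",
    "ip": "203.0.113.7",
}

print(contains_spam_word(comment, ["pills"]))   # True
print(contains_spam_word(comment, ["casino"]))  # False
```

Because the match is on substrings, a short or common entry in the list will catch many legitimate comments, which is one way almost every comment can end up flagged.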