Using some well-known mathematics, it is possible to generate a "spam-indicative probability" for each word.
In "A Plan for Spam" (see Resources later in this article), Graham proposed building Bayesian probability models of spam and non-spam words.
From such a model, one can deduce which combinations of features have a high probability of appearing in spam messages.
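The per-word probability and the way several words are combined can be sketched as follows. This is a minimal illustration, not Graham's exact procedure: the function names `word_spam_probabilities` and `combined_spam_probability`, the whitespace tokenizer, the 0.01/0.99 clamping bounds, the neutral value of 0.5 for unseen words, and the assumption of equal prior odds for spam and non-spam are all choices made here for the example. Both training lists are assumed to be non-empty.

```python
from collections import Counter


def word_spam_probabilities(spam_docs, ham_docs):
    """Estimate, for each word, the probability that a message containing it is spam."""
    # Count each word at most once per message (document frequency).
    spam_counts = Counter(w for doc in spam_docs for w in set(doc.lower().split()))
    ham_counts = Counter(w for doc in ham_docs for w in set(doc.lower().split()))
    n_spam, n_ham = len(spam_docs), len(ham_docs)

    probs = {}
    for word in set(spam_counts) | set(ham_counts):
        spam_rate = spam_counts[word] / n_spam  # fraction of spam containing the word
        ham_rate = ham_counts[word] / n_ham     # fraction of non-spam containing the word
        p = spam_rate / (spam_rate + ham_rate)  # Bayes' rule with equal priors (assumption)
        probs[word] = min(0.99, max(0.01, p))   # clamp so no single word is decisive
    return probs


def combined_spam_probability(message, probs, unknown=0.5):
    """Combine per-word probabilities, treating words as independent (naive Bayes)."""
    p_spam = 1.0
    p_ham = 1.0
    for word in set(message.lower().split()):
        p = probs.get(word, unknown)  # unseen words are treated as neutral (assumption)
        p_spam *= p
        p_ham *= 1.0 - p
    return p_spam / (p_spam + p_ham)
```

A message would then be flagged as spam when the combined value exceeds a chosen threshold. For long messages, summing logarithms instead of multiplying probabilities avoids numeric underflow.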
Based on the difference between the probabilities of a feature appearing in spam and in ham, we propose the concept of degree of contribution (DC), which measures a feature's contribution to mail classification, and give a formula for computing it.