第二、建立文本分类模型,使用大量的有害信息样本数据训练分类模型。
Second, build a text categorization model and train it on a large number of harmful-information samples.
利用训练文档的类信息对文本分类模型进行建模,提取对分类贡献较大的特征。
Use the class information of the training documents to build the text categorization model, and extract the features that contribute most to classification.
给出一种基于多层前馈神经网络的中文文本分类模型,介绍了该模型的设计和实现。
This paper presents a Chinese text categorization model based on a multilayer feedforward neural network and describes the design and implementation of the model.
实验证明,改进后的文本分类模型适合于文本分类的需要,改善了原有分类器的性能。
Experiments show that the improved text categorization model meets the needs of text categorization and improves the performance of the original classifier.
实验结果表明基于信息熵的文本分类模型是一种比较稳定的算法,证明了算法的有效性。
The experimental results show that the text categorization model based on information entropy is a relatively stable algorithm, and they demonstrate the effectiveness of the algorithm.
为了提高文本分类的准确性,研究并设计了一个基于潜在语义分析和支持向量机的多类文本分类模型。
To improve the accuracy of text categorization, a multiclass text categorization model based on latent semantic analysis and support vector machines was studied and designed.
最后在完成基于SVM多类别分类的理论研究的基础上,将其理论应用于实践,构建了一个基于SVM的网络文本分类模型。
Finally, building on the theoretical study of SVM-based multi-class classification, the theory is put into practice and an SVM-based web text categorization model is constructed.
在此基础上,本文提出了一个层次式文本分类模型,然后将此模型应用到中文网页分类这一实际问题中,设计并实现了一个原型系统。
On this basis, this paper proposes a hierarchical text classification model, applies it to the practical problem of Chinese web page classification, and designs and implements a prototype system.
各种文本分类方法可以生成不同类型的 “模型”,即对真实环境的统计学描述。
Various text classification methodologies may yield different types of "models," or statistical descriptions of the world.
例如,NLTK有一个完整的框架,用于通过类似于“naiveBayesian”和“maximumentropy”等模型的统计技术进行文本分类。
For example, NLTK has a whole framework for text classification using statistical techniques like "naive Bayesian" and "maximum entropy" models.
基于拉推策略的基本思想,该文提出了文本分类的增量学习模型ICCDP。
Based on the basic idea of the DragPush strategy, this paper proposes ICCDP, an incremental learning model for text classification.
介绍了一种基于模糊模式识别以及向量空间模型提取特征向量的中文文本分类器的设计与实现。
This paper introduces the design and implementation of a Chinese text classifier based on fuzzy pattern recognition, with feature vectors extracted using the vector space model.
提出一种基于改进的分类模型的文本分类系统来实现文本的自动分类。
This paper proposes a text classification system based on an improved classification model to classify text automatically.
此外,本文还研究了基于向量空间模型的自动文本分类方法,提出了一个新的词权重计算方法,该方法有效提高了分类精度。
In addition, this paper studies an automatic text classification method based on the vector space model and proposes a new term-weighting method that effectively improves classification accuracy.
构建一个分类准确而且稳定的文本分类器是文本分类的关键,很多学者提出了不同的文本分类器模型和算法。
Constructing an accurate and stable text classifier is the key to text categorization, and many researchers have put forward different text classifier models and algorithms.
在文本分类器的设计中,用传统信息检索的空间向量模型改进了朴素贝叶斯分类器,提高了它的分类精度。
In designing the text classifier, the vector space model from traditional information retrieval is used to improve the naive Bayes classifier, which raises its classification accuracy.
第一,提出一种有监督的潜在语义索引(SLSI)模型降维方法,用于文本分类任务中的特征表示。
First, a supervised latent semantic indexing (SLSI) model is proposed as a dimensionality reduction method for feature representation in text classification tasks.
本文对文本分类的关键技术及典型分类方法进行了研究,提出基于词向量空间模型的文本分类方法。
This paper studies the key techniques and typical methods of text categorization and proposes a text categorization method based on the word vector space model.
实验结果表明,采用IT领域模型的文本分类系统在查全率和查准率上都有显著地提高。
The experimental results show that a text classification system using the IT domain model achieves significantly higher recall and precision.
而对于超文本分类、信息检索,则给出了较为简单的模型构建方法。
For hypertext classification and information retrieval, relatively simple model-building methods are given.
该文针对中文科技论文文本特殊的文体格式和语言风格进行了系统地研究,并提出了基于层次分类模型的文本分类算法。
This paper systematically studies the distinctive document format and language style of Chinese scientific and technical papers, and proposes a text classification algorithm based on a hierarchical classification model.
这些算法和模型对今后研究文本分类以及其它文本处理问题将有很大的参考价值和借鉴作用。
The algorithms and models presented in this dissertation will be valuable for future studies in text classification and other fields in text processing.
然后,介绍了传统的基于关键字的向量空间模型的文本分类的几个重要阶段,并着重介绍了其中的文本表示的相关技术和两种经典分类算法。
Then, this paper introduces several important stages of traditional text classification based on the keyword vector space model, focusing on the related text representation techniques and two classic classification algorithms.