Automatic web page classification is currently a hot research topic in Internet search. Existing approaches mainly classify pages either by their text content or by the hyperlink structure between pages.
Second, the system uses machine learning techniques, including text classification, clustering, and concept extraction, to identify the domain of a web page and understand the document at the concept level.
This paper uses the classical vector space model to classify web page text.
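As a rough illustration of what classifying text with a vector space model can look like, the sketch below represents each document as a TF-IDF weighted term vector and assigns a new text the label of the most similar training document by cosine similarity. This is only one common way to set it up and is not taken from the paper the sentence refers to: the whitespace tokenizer, the smoothed IDF formula, the nearest-neighbor decision rule, the function names, and the toy training data are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def build_idf(docs):
    # docs is a list of token lists; returns a smoothed IDF weight per term.
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    # Sparse vector as a dict; terms unseen in training are dropped.
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, labeled_vectors, idf):
    # Return the label of the training vector closest to the query.
    query = tfidf(tokenize(text), idf)
    return max(labeled_vectors, key=lambda item: cosine(query, item[1]))[0]


# Hypothetical toy training set of (label, page text) pairs.
train = [
    ("sports", "football match score team goal"),
    ("sports", "team wins league match"),
    ("tech", "software release python code"),
    ("tech", "new processor chip code benchmark"),
]
docs = [tokenize(text) for _, text in train]
idf = build_idf(docs)
vectors = [(label, tfidf(tokens, idf)) for (label, _), tokens in zip(train, docs)]

print(classify("the team scored a goal in the match", vectors, idf))
```

For real web pages the text would first have to be extracted from the HTML, and a centroid per class or a trained classifier over the same vectors is often used instead of a single nearest neighbor.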