A topic crawler based on a dynamic topic base is proposed, following a study of topic crawlers that filter URLs using different strategies.
To adapt to the dynamic nature and completeness of topics, this article proposes a mixed-strategy topic crawler based on web log analysis.
On this basis, a topic crawler system was designed and implemented that uses topic-sensitive Hyperlink-Induced Topic Search (HITS) to compute the priority of web pages.
The vector representations of documents and topic levels are simplified, and a prototype focused crawler is designed and implemented.
A focused web crawler does not aim for broad coverage; its goal is to fetch web pages relevant to a specific topic and to prepare data for topic-oriented user queries.