A topic crawler based on a dynamic topic base is proposed, following a study of topic crawlers that filter URLs using different strategies.
To adapt to the dynamic nature and completeness of topics, this article proposes a mixed-strategy topic crawler based on web log analysis.
On this basis, a topic crawler system was designed and implemented, using topic-sensitive Hyperlink-Induced Topic Search (HITS) to predict the priority of fetched Web pages.
The vector representations of documents and topic levels are simplified, and a prototype of the focused crawler is designed and implemented.
The search strategy of the topic web crawler is the core technology of a specialized search engine.
A focused web crawler does not aim for broad coverage; instead, its goal is to fetch web pages relevant to a specific topic and to prepare data resources for topic-oriented user queries.