You can use a shorter or longer timeout based on how long you want the program to run, your network connection's speed, and the depth of the Web site you want to crawl.
根据想要让程序运行多久、网络连接速度和想对网站进行处理的深度,您可以使用更长或更短的超时。
To take this program to the next level and make it crawl a whole site, it's best to break it apart into smaller tasks.
要改进此程序,使其作用到整个网站,最好将其分解为小一些的任务。
In this command, the -r switch tells GNU Wget to recursively follow all links found on that page, so the tool will crawl the entire site.
在这个命令中,-r 开关告诉 GNU Wget 递归地跟踪那个页面上的所有链接,所以该工具将抓取整个站点。
To locate the largest images, run the general crawl for your site and then sort the table by "SIZE" column. This will bring up the largest files linked from your site pages.
要定位最大的图片,对你的站点运行常规抓取操作,然后对表格按照“尺寸(SIZE)”列进行排序,这样会将链接到你站点的最大的文件提到最前面。
If you want to crawl a web site and index that, you can use htdig2omega, which accepts a URL and searches the entire web site.
如果想要对一个web站点进行爬行并索引,可以使用htdig2omega,它能接受一个URL并搜索整个web站点。
Doing a retrofit on your shiny new Flash-based site after it is built won't cut it. Spiders can crawl text, not Flash or images.
在闪亮的基于Flash的新网站建成之后再做改造是行不通的。蜘蛛能抓取文本,但不能抓取Flash或图片。
By adding fresh content at regular intervals, you signal the engines to crawl more often and also create more potential search phrases that users can find your site for.
通过定期添加新内容,你就有了更多机会在搜索引擎中稳步上升,而且还可以创造出能让更多用户找到网站的搜索词语。
You'll collect a series of links from a single Web page and then crawl a whole site.
您将从单个网页收集一系列链接,然后蔓延到整个网站。
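The first half of that idea, collecting the links from a single page, might be sketched as follows. This is a minimal illustration using only the Python standard library, not the program the sentence refers to; the names `LinkCollector` and `same_site_links`, the sample HTML, and the `example.com` URLs are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html, base_url):
    """Return the unique links in html that stay on base_url's host, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    seen = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in seen:
            seen.append(link)
    return seen


# Hypothetical page content; a real crawler would download this first.
page = (
    '<a href="/about">About</a> '
    '<a href="https://other.example/x">Elsewhere</a> '
    '<a href="/about">About again</a>'
)
print(same_site_links(page, "https://www.example.com/index.html"))
```

Crawling a whole site would then amount to repeating this step for each returned link while keeping track of the pages already visited.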
If vandalizing one site takes a number of seconds rather than a small fraction of a second, it becomes less appealing to crawl Wikis spewing junk.
如果破坏一个站点需要好几秒钟,而不是几分之一秒,那么在各个 Wiki 间爬行并散布垃圾内容就不那么有吸引力了。
Sitemaps 0.90 is a simple and intuitive way for webmasters to provide the right level of information to Web crawlers so that they can efficiently crawl a Web site.
Sitemaps 0.90为Web 站点管理员提供了简单而且直观的方法,使他们可以向web爬虫程序提供正确的信息,从而使爬虫程序能够高效地爬行某个 Web 站点。
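As a rough illustration of what such a file contains, the sketch below builds a small Sitemaps 0.90 document with Python's standard library. The `urlset`, `url`, `loc`, and `lastmod` elements and the namespace come from the Sitemaps protocol; the function name `build_sitemap` and the `example.com` pages are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps 0.90 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap XML string from (url, last_modified) pairs.

    last_modified may be None, in which case <lastmod> is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "2024-01-01"),
    ("https://www.example.com/about", None),
])
print(sitemap_xml)
```

A published sitemap file would normally also start with an XML declaration and be saved as UTF-8.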
The three major elements of a site map (text, links, and keywords) all greatly help search engines crawl the main page content.
网站地图的三大因素:文本、连结、关键词,都极其有利于搜寻引擎抓取主要页面内容。
We know that the role of a site map is to take content that search engines have not yet crawled and, by submitting the site map, get it crawled and indexed; this technique is mostly used on large and very large web sites.
我们很清楚网站地图的作用是将搜索引擎没有抓取到的内容,通过网站地图提交进行抓取,从而完成抓取收录,该技巧多被用于大型和巨型网站。
On top of that, your shopping cart has slowed to a crawl because thousands of packets of useless data are slamming into your Web site, creating annoyance and outright anger in your customer base.
此外,您的购物车也慢得跟蜗牛一样,因为数千个无用数据包正在冲击您的Web站点,这也使客户烦恼和愤怒。
However, if this file is written incorrectly, search engines may be unable to crawl your site at all.
但如果这个文件写的有问题,可能会造成搜索引擎完全不能爬行你的网站。
Often a bony, lanky black spider will crawl around your web site, wiggle its little feet in excitement, gather up your email address, and store it near its pincers.
通常,一只瘦长的“黑蜘蛛”会在网站周围徘徊,兴奋地挥舞着腿脚,窃取你的电子邮箱地址。
In general, the search engine spiders will follow the link to your site to crawl all pages.
一般来说,搜索引擎蜘蛛会跟着链接爬行到你网站的所有网页。
The next crawl, due to be completed later, will note new sites, changes to existing sites, and dead links, and the search results will be adjusted to reflect the content changes.
稍后完成的下一次抓取会注意到新网站、对现有网站的更改以及失效的链接,并对内容的变化在搜索结果中加以调整。
Developing a site map is a simple way of giving search engines the information they need to crawl your entire website.
建立站点地图是告诉搜索引擎它需要爬行的你的整个网站的内容的简单方法。
Fortunately, this tool will allow you to enter a URL and it will crawl through the site and look for links that return 404 pages.
幸运的是,这个工具可以让您输入一个网址,然后它会抓取整个网站,查找返回404页面的链接。
Note that the web site crawler cannot crawl secured pages.
请注意,该网站的抓取工具无法抓取安全网页。