USING A SPIDER - HTTP Programming Recipes for Java Bots


spider.addURL(base, null, 1);

Once the spider has been created, it is started by calling its process method. When the spider finishes, its status is displayed.

spider.process();

System.out.println(spider.getStatus());
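The add, process, and report-status sequence above can be tried without any network access. The following is a minimal sketch, not the book's Spider class: ToySpider, ToySpiderDemo, and the in-memory link map are hypothetical names invented for illustration, and the toy addURL takes a single string rather than the three arguments used above. It imitates only the order of the calls.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical toy spider; it only imitates the add/process/status call order.
class ToySpider {
    private final Map<String, List<String>> links; // page -> links on that page
    private final Queue<String> waiting = new ArrayDeque<>();
    private final Set<String> seen = new HashSet<>();
    private int processed = 0;

    ToySpider(Map<String, List<String>> links) {
        this.links = links;
    }

    // Queue a URL unless it has been seen before.
    void addURL(String url) {
        if (seen.add(url)) {
            waiting.add(url);
        }
    }

    // Drain the queue, adding every newly discovered link along the way.
    void process() {
        while (!waiting.isEmpty()) {
            String current = waiting.remove();
            processed++;
            for (String next : links.getOrDefault(current, List.of())) {
                addURL(next);
            }
        }
    }

    String getStatus() {
        return "Processed: " + processed + ", waiting: " + waiting.size();
    }
}

class ToySpiderDemo {
    public static void main(String[] args) {
        // A made-up three-page site held in memory instead of fetched over HTTP.
        Map<String, List<String>> site = Map.of(
            "/index.html", List.of("/a.html", "/b.html"),
            "/a.html", List.of("/index.html", "/b.html"));
        ToySpider spider = new ToySpider(site);
        spider.addURL("/index.html");
        spider.process();
        System.out.println(spider.getStatus()); // prints "Processed: 3, waiting: 0"
    }
}
```

The set of seen URLs is what keeps the toy from looping when two pages link to each other.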

The actual downloading is performed by the SpiderReport class, which is discussed in the next section.

Receiving Data for the Download Spider

As in the last recipe, this spider requires a class that implements the SpiderReportable interface. This object manages the spider and receives all information found by the spider. The site download spider uses the SpiderReport class to implement the SpiderReportable interface.
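The callback arrangement described above can be sketched in isolation. The example below is a minimal sketch and does not use the book's spider package: FoundUrlListener, HostOnlyListener, and ReportSketch are hypothetical names invented for illustration, and the real SpiderReportable interface may declare different methods. It shows only the idea that the spider hands each discovered URL to a report object, which decides whether the URL belongs to the base host.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a report interface; not the book's SpiderReportable.
interface FoundUrlListener {
    // Return true if the spider should queue this URL for processing.
    boolean urlFound(URL url);
}

// Accepts only URLs whose host matches the base host (case-insensitive).
class HostOnlyListener implements FoundUrlListener {
    private final String base;
    private final List<URL> accepted = new ArrayList<>();

    HostOnlyListener(String base) {
        this.base = base;
    }

    @Override
    public boolean urlFound(URL url) {
        boolean sameHost = url.getHost().equalsIgnoreCase(base);
        if (sameHost) {
            accepted.add(url);
        }
        return sameHost;
    }

    List<URL> getAccepted() {
        return accepted;
    }
}

class ReportSketch {
    public static void main(String[] args) throws MalformedURLException {
        HostOnlyListener listener = new HostOnlyListener("www.example.com");
        URL onSite = URI.create("http://www.example.com/a.html").toURL();
        URL offSite = URI.create("http://other.example.org/b.html").toURL();
        System.out.println(listener.urlFound(onSite));  // prints "true"
        System.out.println(listener.urlFound(offSite)); // prints "false"
    }
}
```

Keeping the host check in the report object, rather than in the spider itself, lets the same spider be reused with a different acceptance rule.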

Listing 13.7 shows the SpiderReport class.

Listing 13.7: Report Download Information (SpiderReport.java)

package com.heatonresearch.httprecipes.ch13.recipe2;

import java.io.*;
import java.net.*;
import com.heatonresearch.httprecipes.html.*;
import com.heatonresearch.httprecipes.spider.*;

public class SpiderReport implements SpiderReportable {
  /*
   * The base host. Only URLs from this host will be
   * downloaded.
   */
  private String base;

  /*
   * The local path to save downloaded files to.
   */
  private String path;

  /**
   * Construct a SpiderReport object.
   *
   * @param path
   *          The local file path to store the files to.
   */
