            e.printStackTrace();
        }
    }
    /**
     * Called when the spider is ready to process an HTML
     * URL. Download the contents of the URL to a local file.
     *
     * @param url
     *            The URL that the spider is about to process.
     * @param parse
     *            An object that will allow you to parse the
     *            HTML on this page.
     * @throws IOException
     *             Thrown if an IO error occurs while processing
     *             the page.
     */
    public void spiderProcessURL(URL url, SpiderParseHTML parse)
            throws IOException {
        String filename =
            URLUtility.convertFilename(this.path, url, true);
        // try-with-resources closes the file even if readAll() throws.
        try (OutputStream os = new FileOutputStream(filename)) {
            parse.getStream().setOutputStream(os);
            parse.readAll();
        }
    }
    /**
     * Called when the spider tries to process a URL but gets
     * an error. This method is not used in this recipe.
     *
     * @param url
     *            The URL that generated an error.
     */
    public void spiderURLError(URL url) {
    }
}
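The HTML overload of spiderProcessURL in the listing above attaches a FileOutputStream to the parser's stream and then calls readAll(). This only saves the page if the stream copies each byte it reads to the attached output stream. The following is a minimal sketch of that idea, not the spider library's own code: EchoInputStream, TeeSketch, and this readAll helper are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class TeeSketch {
    /** Hypothetical stand-in for the spider's stream: echoes every byte read to a sink. */
    static class EchoInputStream extends FilterInputStream {
        private OutputStream sink; // null means "do not echo"

        EchoInputStream(InputStream in) {
            super(in);
        }

        void setOutputStream(OutputStream sink) {
            this.sink = sink;
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1 && sink != null) {
                sink.write(b);
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0 && sink != null) {
                sink.write(buf, off, n);
            }
            return n;
        }
    }

    /** Drains the stream; the echo to the sink happens as a side effect of reading. */
    static void readAll(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        while (in.read(buf, 0, buf.length) != -1) {
            // nothing to do here, the stream itself writes to the sink
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] page = "<html><body>Hello</body></html>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream saved = new ByteArrayOutputStream();
        EchoInputStream stream = new EchoInputStream(new ByteArrayInputStream(page));
        stream.setOutputStream(saved);
        readAll(stream);
        System.out.println(new String(saved.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In this sketch the caller never writes to the sink directly. Reading the page to the end is what fills the file, which is why the recipe only needs to call readAll() after attaching the output stream.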
Much of Recipe 13.2's SpiderReportable implementation is similar to that of Recipe 13.1.
However, unlike Recipe 13.1, Recipe 13.2 actually downloads what it finds. This downloading
functionality is implemented in the two overloads of the spiderProcessURL method. The first
spiderProcessURL overload takes an InputStream:
public void spiderProcessURL(URL url, InputStream stream)
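The body of this overload is not shown in this section. As a rough sketch of what saving a raw InputStream to a local file can look like, the example below copies a stream to a file in fixed-size chunks. It assumes the target file has already been chosen (the recipe uses URLUtility.convertFilename for that); StreamToFileSketch and saveStream are hypothetical names, not part of the recipe.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamToFileSketch {
    /** Copies everything from stream into target and returns the number of bytes written. */
    static long saveStream(InputStream stream, File target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        // try-with-resources closes the file even if a read or write fails.
        try (OutputStream os = new FileOutputStream(target)) {
            int n;
            while ((n = stream.read(buffer)) != -1) {
                os.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("spider", ".bin");
        target.deleteOnExit();
        byte[] data = {1, 2, 3, 4, 5};
        long copied = saveStream(new ByteArrayInputStream(data), target);
        System.out.println(copied + " bytes written to " + target);
    }
}
```

Copying in chunks keeps memory use constant regardless of file size, which matters for a spider that may encounter large images or archives.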