The first exception is the SpiderException. The Spider class throws a
SpiderException when a severe error occurs, that is, an error that prevents the
spider from continuing. Errors confined to an individual web page are not thrown
as spider errors.
The second exception is the WorkloadException. A WorkloadException is thrown
when there is an error with the workload. This can be an SQL exception or another
communication error that occurs when working with an SQL-based workload manager.
Classes external to the spider are usually not exposed to the WorkloadException.
Rather, these classes see the WorkloadException wrapped in a SpiderException.
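The wrapping described above can be sketched as follows. This is a minimal illustration, not the book's code: the two exception classes are simplified stand-ins with assumed constructors, and WrapDemo, addToWorkload and processUrl are hypothetical names invented for the example.

```java
import java.sql.SQLException;

// Simplified stand-ins for the book's exception classes.
// The constructors are assumptions; the real classes may differ.
class WorkloadException extends Exception {
    WorkloadException(String message, Throwable cause) {
        super(message, cause);
    }
}

class SpiderException extends Exception {
    SpiderException(Throwable cause) {
        super(cause);
    }
}

// Hypothetical demo class, not part of the spider.
class WrapDemo {
    // Simulates an SQL based workload manager that fails.
    static void addToWorkload(String url) throws WorkloadException {
        throw new WorkloadException("Could not store " + url,
                new SQLException("connection lost"));
    }

    // Spider-level method: callers only ever see a SpiderException.
    static void processUrl(String url) throws SpiderException {
        try {
            addToWorkload(url);
        } catch (WorkloadException e) {
            // Keep the original error as the cause so nothing is lost.
            throw new SpiderException(e);
        }
    }

    public static void main(String[] args) {
        try {
            processUrl("http://www.example.com/");
        } catch (SpiderException e) {
            System.out.println("Spider failed: " + e.getCause().getMessage());
        }
    }
}
```

Because the WorkloadException travels as the cause, code outside the spider needs only one catch block, yet the underlying workload error can still be inspected with getCause().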
Configuring the Spider
The SpiderOptions class is used to configure the spider. Other Java classes can
configure the spider directly by modifying the public properties of a
SpiderOptions object. The SpiderOptions class can also load configuration
options from a file. Listing 14.2 shows the SpiderOptions class.
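Before the full listing, the two configuration routes can be sketched in isolation. DemoOptions below is a hypothetical, trimmed-down stand-in for SpiderOptions: the timeout field and the two STARTUP constants mirror the listing, while the startup field, the load(Reader) method, and the key=value file format are assumptions made for illustration. The actual loading mechanism may differ.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.lang.reflect.Field;
import java.util.Properties;

// Hypothetical stand-in for SpiderOptions, reduced to two options.
class DemoOptions {
    public static final String STARTUP_CLEAR = "CLEAR";
    public static final String STARTUP_RESUME = "RESUME";

    public int timeout = 60000;
    // Assumed field, added for illustration.
    public String startup = STARTUP_CLEAR;

    // Assumed loader: reads key=value pairs and copies each value
    // onto the public field with the same name, using reflection.
    public void load(Reader in)
            throws IOException, ReflectiveOperationException {
        Properties props = new Properties();
        props.load(in);
        for (String name : props.stringPropertyNames()) {
            Field field = getClass().getField(name);
            String value = props.getProperty(name).trim();
            if (field.getType() == int.class) {
                field.setInt(this, Integer.parseInt(value));
            } else {
                field.set(this, value);
            }
        }
    }
}

// Hypothetical demo class showing both configuration routes.
class OptionsDemo {
    public static void main(String[] args) throws Exception {
        // Route 1: another class sets the public properties directly.
        DemoOptions direct = new DemoOptions();
        direct.timeout = 30000;
        direct.startup = DemoOptions.STARTUP_RESUME;

        // Route 2: options come from a file (a string stands in here).
        DemoOptions loaded = new DemoOptions();
        loaded.load(new StringReader("timeout=15000\nstartup=RESUME\n"));
        System.out.println(loaded.timeout + " " + loaded.startup);
    }
}
```

Matching file keys to field names by reflection means a new option only requires a new public field. The trade-off is that a misspelled key fails at run time with a NoSuchFieldException rather than at compile time.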
Listing 14.2: Configuring the Spider (SpiderOptions.java)
package com.heatonresearch.httprecipes.spider;

import java.io.*;
import java.lang.reflect.*;
import java.util.*;

public class SpiderOptions {
  /**
   * Specifies that when the spider starts up it should
   * clear the workload.
   */
  public static final String STARTUP_CLEAR = "CLEAR";

  /**
   * Specifies that the spider should resume processing its
   * workload.
   */
  public static final String STARTUP_RESUME = "RESUME";

  /**
   * How many milliseconds to wait when downloading pages.
   */
  public int timeout = 60000;