Java Reference
In-Depth Information
Table 14.2: Instance Variables for the Spider Class
Instance Variable   Purpose
cancel              A flag that indicates whether this process should be canceled.
filters             Filters used to block specific URLs.
logger              The object that the spider reports its findings to.
options             The configuration options for the spider.
startTime           The time that the spider began.
stopTime            The time that the spider ended.
tasks               The BlockingQueue that will hold tasks for the thread pool.
threadPool          The Java thread executor that will manage the thread pool.
workloadManager     The workload manager. The spider can use any of several different workload managers. The workload manager tracks all URLs found.
There are also a number of methods that perform important tasks for the
Spider class. These are discussed in the next few sections.
The Spider Constructor
The Spider class's constructor begins by saving the SpiderOptions object and the
report object that were passed to it in instance variables. This allows the
spider to refer to these important objects later.
this.options = options;
this.report = report;
Next, a workload manager is instantiated from the class name provided in the
SpiderOptions object. The init method is then called on both the workload manager
and the report object, passing each a reference to the spider.
this.workloadManager = (WorkloadManager) Class.forName(
options.workloadManager).newInstance();
this.workloadManager.init(this);
report.init(this);
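The reflective instantiation shown above can be tried in isolation. The following is a minimal sketch, not the book's actual classes: the WorkloadSketch interface, the InMemoryWorkloadSketch class, and the WorkloadLoaderSketch helper are hypothetical stand-ins, and init takes a plain String where the real method receives the spider. It calls getDeclaredConstructor().newInstance() because Class.newInstance() has been deprecated since Java 9.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for the workload manager interface.
interface WorkloadSketch {
    void init(String owner); // the real init receives the spider itself
    void add(String url);
    int size();
}

// Hypothetical in-memory implementation. Reflection needs a no-argument constructor.
class InMemoryWorkloadSketch implements WorkloadSketch {
    private final Queue<String> urls = new ArrayDeque<>();
    private String owner;

    public InMemoryWorkloadSketch() { }

    @Override public void init(String owner) { this.owner = owner; }
    @Override public void add(String url) { urls.add(url); }
    @Override public int size() { return urls.size(); }
    public String getOwner() { return owner; }
}

class WorkloadLoaderSketch {
    // Looks the class up by name, checks that it implements WorkloadSketch,
    // creates an instance, and initializes it.
    static WorkloadSketch load(String className, String owner)
            throws ReflectiveOperationException {
        Class<? extends WorkloadSketch> type =
                Class.forName(className).asSubclass(WorkloadSketch.class);
        WorkloadSketch manager = type.getDeclaredConstructor().newInstance();
        manager.init(owner);
        return manager;
    }
}
```

Because the implementation is chosen by name at run time, a different workload manager can be swapped in through configuration alone, at the cost of errors surfacing only when the class is loaded.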
The thread pool is set up next. This uses the JDK 1.5 ThreadPoolExecutor to
implement the spider's thread pool. The thread pool is started with the options specified in
the SpiderOptions object.
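As a rough illustration of how a ThreadPoolExecutor can be built on top of a BlockingQueue of tasks, consider the sketch below. It is not the book's constructor code: PoolOptionsSketch, its field names and default values, and the SpiderPoolSketch class are all assumptions made for the example, and the task only records the URL instead of downloading it.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical subset of spider options; names and defaults are illustrative.
class PoolOptionsSketch {
    int corePoolSize = 4;
    int maximumPoolSize = 4;
    long keepAliveSeconds = 60;
}

class SpiderPoolSketch {
    // The queue holds tasks until a pool thread is free to run them.
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    private final ThreadPoolExecutor threadPool;
    private final Set<String> visited = ConcurrentHashMap.newKeySet();

    SpiderPoolSketch(PoolOptionsSketch options) {
        threadPool = new ThreadPoolExecutor(
                options.corePoolSize,
                options.maximumPoolSize,
                options.keepAliveSeconds,
                TimeUnit.SECONDS,
                tasks);
    }

    // A real task would download and parse the page; this one only records the URL.
    void submit(String url) {
        threadPool.execute(() -> visited.add(url));
    }

    // Stops accepting new tasks and waits for queued ones to finish.
    // Returns true if the pool terminated within the timeout.
    boolean finish(long timeoutSeconds) {
        threadPool.shutdown();
        try {
            return threadPool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int visitedCount() {
        return visited.size();
    }
}
```

With an unbounded LinkedBlockingQueue the pool never grows past its core size, so the maximum size only matters if a bounded queue is used instead.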