Java Reference
In-Depth Information
"workload.sql.SQLWorkloadManager");
spiderOptions.startup("clear");
spiderOptions.filters.add(
"com.heatonresearch.httprecipes.spider.filter.RobotsFilter");
You can also configure the spider using a configuration file. This is discussed in the next
section.
Configuring with a Configuration File
It is often more convenient to use a configuration file than to set the values of the
SpiderOptions object directly. To use a configuration file, create a text file that contains a
single line for each configuration option. Each configuration option is a name-value pair. A
colon (:) separates the name and value. The name corresponds to the property name in the
SpiderOptions class. See Table 13.1 for a complete list of configuration options. A
sample configuration file is shown in Listing 13.1.
Listing 13.1: A Configuration file for the Spider (spider.conf)
timeout: 60000
maxDepth: -1
userAgent:
corePoolSize: 100
maximumPoolSize: 100
keepAliveTime: 60
dbURL: jdbc:mysql://192.168.1.10/spider
dbClass: com.mysql.jdbc.Driver
dbUID: root
dbPWD: test
workloadManager: com.heatonresearch.httprecipes.spider.workload.sql.SQLWorkloadManager
startup: clear
filter: com.heatonresearch.httprecipes.spider.filter.RobotsFilter
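Because values such as dbURL contain colons of their own, a reader of this format has to split each line at the first colon only. The following is a minimal sketch of that idea, not the spider's actual loading code; the class name ConfigLineSketch and its parse method are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the spider: shows how "name: value"
// lines can be split without damaging values that contain colons.
public class ConfigLineSketch {

    // Splits each line at the FIRST colon only and trims both halves.
    public static Map<String, String> parse(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (String line : lines) {
            int idx = line.indexOf(':');
            if (idx < 0) {
                continue; // not a name-value pair; ignored in this sketch
            }
            String name = line.substring(0, idx).trim();
            String value = line.substring(idx + 1).trim();
            result.put(name, value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(Arrays.asList(
                "timeout: 60000",
                "userAgent:",
                "dbURL: jdbc:mysql://192.168.1.10/spider"));
        // The JDBC URL keeps its own colons because only the first one splits.
        System.out.println(options.get("dbURL"));
    }
}
```

In this sketch a name that appears more than once keeps only its last value, and an option with nothing after the colon, such as userAgent, yields an empty string.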
Once the configuration file is set up, telling the spider to use it is straightforward.
Call the load method on the SpiderOptions object as follows:
SimpleReport report = new SimpleReport();
SpiderOptions options = new SpiderOptions();
options.load("c:\\spider.conf");
Spider spider = new Spider(options,report);
Once the SpiderOptions object has been loaded, it is passed to the spider's constructor.
A report variable is also passed; the spider uses it to report its findings. This will be
discussed in the next section.
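As noted earlier, each name in the configuration file corresponds to a property name in the SpiderOptions class. One way such a name-to-property mapping could work is Java reflection. The sketch below is an assumption for illustration only and is not necessarily how the load method is implemented; DemoOptions and OptionFieldSketch are hypothetical classes, not part of the spider.

```java
import java.lang.reflect.Field;

// Hypothetical illustration of mapping an option name to a field
// of the same name. This is not the real SpiderOptions class.
public class OptionFieldSketch {

    // Stand-in options class with a few public fields.
    public static class DemoOptions {
        public int timeout;
        public int maxDepth;
        public String userAgent;
    }

    // Looks up the public field called `name` and assigns `value` to it,
    // converting the text to an int when the field is an int.
    public static void apply(Object target, String name, String value)
            throws ReflectiveOperationException {
        Field field = target.getClass().getField(name);
        if (field.getType() == int.class) {
            field.setInt(target, Integer.parseInt(value));
        } else if (field.getType() == String.class) {
            field.set(target, value);
        } else {
            throw new IllegalArgumentException(
                    "Unsupported field type: " + field.getType());
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        DemoOptions options = new DemoOptions();
        apply(options, "timeout", "60000");
        apply(options, "maxDepth", "-1");
        apply(options, "userAgent", "");
        System.out.println(options.timeout + " " + options.maxDepth);
    }
}
```

An unknown option name makes getField throw NoSuchFieldException in this sketch, which surfaces misspelled names rather than silently ignoring them.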