    try
    {
      ExtractPartial parse = new ExtractPartial();
      parse.process();
    } catch (Exception e)
    {
      e.printStackTrace();
    }
  }
}
This recipe works by downloading the first page, then following the “next page” links
until the end is reached.
Processing the First Page
The process method of the ExtractPartial class is used to access the first
page and download subsequent pages. Note that the ExtractPartial class has two
process methods. The one that starts the download is the process method that
accepts no parameters. It begins by obtaining a URL to the first page.
URL url = new URL("http://www.httprecipes.com/1/6/partial.php");
do
{
  url = process(url);
} while (url != null);
The URL is passed to the process method that accepts a URL. This process method
returns the URL of the next page, or null if there is no next page. The loop
continues until all pages have been downloaded.
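The loop described above can be tried without a network connection. The following is only a minimal sketch, not the recipe's own code: the class name FollowNextLinks, the NEXT_PAGE map, and the use of plain strings in place of URL objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FollowNextLinks {
  // Hypothetical stand-in for the web site: page name -> name of the next page.
  // The last page has no entry, so the lookup returns null.
  static final Map<String, String> NEXT_PAGE = new HashMap<>();
  static {
    NEXT_PAGE.put("page1", "page2");
    NEXT_PAGE.put("page2", "page3");
  }

  // Stand-in for process(URL): records the visit, then returns the
  // next page, or null when there is no next page.
  static String process(String page, List<String> visited) {
    visited.add(page);
    return NEXT_PAGE.get(page);
  }

  // Same shape as the recipe's loop: keep processing until null comes back.
  static List<String> crawl(String firstPage) {
    List<String> visited = new ArrayList<>();
    String page = firstPage;
    do {
      page = process(page, visited);
    } while (page != null);
    return visited;
  }

  public static void main(String[] args) {
    System.out.println(crawl("page1")); // prints [page1, page2, page3]
  }
}
```

A do/while loop fits here because the first page always exists and must be processed at least once before there is any "next page" to test.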
Processing Individual Pages
The overloaded process method that accepts a URL is called for each partial-page that
is found. The method begins by creating some variables that will be needed to process the
page. The result variable holds the next partial-page, or null if there is no next page.
The buffer variable holds non-tag text encountered. The value variable holds the
href attribute for <a> tags found. The src variable holds the src attribute for <img>
tags encountered.
URL result = null;
StringBuilder buffer = new StringBuilder();
String value = "";
String src = "";
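This excerpt does not show how the method turns a captured href into the returned URL. One plausible approach, sketched here under assumptions, is to resolve the href against the address of the current page. The class name ResolveNextPage, the resolveNext helper, and the href "next.php" are hypothetical, and the sketch uses java.net.URI rather than URL so that no checked exceptions are involved.

```java
import java.net.URI;

class ResolveNextPage {
  // Hypothetical helper: turns the href of a "next page" link into an
  // absolute address, or returns null when no link was found.
  static URI resolveNext(URI page, String href) {
    if (href == null || href.isEmpty()) {
      return null; // no next page, so the download loop would stop
    }
    return page.resolve(href); // a relative href is resolved against the page
  }

  public static void main(String[] args) {
    URI page = URI.create("http://www.httprecipes.com/1/6/partial.php");
    // "next.php" is an assumed href value, not one taken from the real site.
    System.out.println(resolveNext(page, "next.php"));
    // prints http://www.httprecipes.com/1/6/next.php
    System.out.println(resolveNext(page, "")); // prints null
  }
}
```

Calling toURL() on the returned URI gives a URL object of the kind the recipe's process method returns.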