}
}
The process method of this recipe performs most of the actual work of submitting
the form and processing the response. Two parameters are passed to the process method:
the first is the search string to use and the second indicates whether we will be doing a state
search or a capital search.
The process method begins by setting up several variables that will be needed. The
states or capitals returned from the search will be in an HTML list, so the names of the
starting and ending tags, in this case ul and /ul (for <ul> and </ul>), are stored in the
variables listType and listTypeEnd. Additionally, a StringBuilder named buffer is
created to hold the HTML text as it is encountered. The boolean variable capture indicates
whether text is currently being captured to the StringBuilder.
String listType = "ul";
String listTypeEnd = "/ul";
StringBuilder buffer = new StringBuilder();
boolean capture = false;
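To illustrate how these four variables cooperate, here is a minimal sketch of the capture-flag pattern. It is not the recipe's own loop, which relies on the ParseHTML class; instead it scans a string by hand. The class name CaptureSketch, the method extractListItems, and the helper flush are hypothetical names chosen for this illustration, and the sketch assumes tags carry no attributes.

```java
import java.util.ArrayList;
import java.util.List;

class CaptureSketch {
    // Collects the text of each list item found between <ul> and </ul>.
    // Assumption: tags have no attributes, so "<ul class=...>" is not matched.
    static List<String> extractListItems(String html) {
        String listType = "ul";
        String listTypeEnd = "/ul";
        StringBuilder buffer = new StringBuilder();
        boolean capture = false;
        List<String> items = new ArrayList<>();

        int i = 0;
        while (i < html.length()) {
            char ch = html.charAt(i);
            if (ch == '<') {
                int close = html.indexOf('>', i);
                if (close < 0) {
                    break; // unterminated tag: stop scanning
                }
                String tag = html.substring(i + 1, close).trim().toLowerCase();
                if (tag.equals(listType)) {
                    capture = true;            // entering the list
                } else if (tag.equals(listTypeEnd)) {
                    flush(buffer, items);      // leaving the list
                    capture = false;
                } else if (capture && (tag.equals("li") || tag.equals("/li"))) {
                    flush(buffer, items);      // item boundary inside the list
                }
                i = close + 1;
            } else {
                if (capture) {
                    buffer.append(ch);         // only keep text while capturing
                }
                i++;
            }
        }
        return items;
    }

    // Moves any non-blank buffered text into the result list and clears the buffer.
    private static void flush(StringBuilder buffer, List<String> items) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            items.add(text);
        }
        buffer.setLength(0);
    }
}
```

Text outside the list is skipped because capture is false there, which is the role the flag plays in the recipe as well.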
The FormUtility class is designed to output to an OutputStream . For an HTTP
POST request, this would be fine. However, since this is an HTTP GET request, the form data
must be encoded into the URL. To do this, we create a ByteArrayOutputStream .
This stream allows the FormUtility to output the form data to an OutputStream ,
and once it is done we can obtain the formatted name-value pairs.
The three calls to the add method below set up the required name-value pairs
for the form.
// Build the URL.
ByteArrayOutputStream bos = new ByteArrayOutputStream();
FormUtility form = new FormUtility(bos, null);
form.add("search", search);
form.add("type", type);
form.add("action", "Search");
form.complete();
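For readers without the book's FormUtility class at hand, the following sketch shows one way name-value pairs could be URL-encoded into an OutputStream using only the standard library. It is an assumption about the general technique, not FormUtility's actual implementation. The class QueryStringSketch and its build method are hypothetical, and the values passed for search and type are placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class QueryStringSketch {
    private final OutputStream out;
    private boolean first = true;

    QueryStringSketch(OutputStream out) {
        this.out = out;
    }

    // Writes one URL-encoded name=value pair, preceded by '&' for all but the first.
    void add(String name, String value) throws IOException {
        if (!first) {
            out.write('&');
        }
        first = false;
        String pair = URLEncoder.encode(name, "UTF-8") + "="
                + URLEncoder.encode(value, "UTF-8");
        // The encoded form contains only ASCII characters.
        out.write(pair.getBytes(StandardCharsets.US_ASCII));
    }

    // Mirrors the three add calls in the recipe and returns the encoded pairs.
    static String build(String search, String type) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        QueryStringSketch form = new QueryStringSketch(bos);
        form.add("search", search);
        form.add("type", type);
        form.add("action", "Search");
        return bos.toString("US-ASCII");
    }
}
```

Encoding matters whenever the search string contains characters such as spaces or ampersands, which would otherwise change how the server splits the query string.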
Next, the URL is constructed by concatenating the output from the
ByteArrayOutputStream to the base URL. The URL can then be opened
and downloaded. A ParseHTML object is then created to parse the HTML.
String surl = "http://www.httprecipes.com/1/7/get.php?" + bos.toString();
URL url = new URL(surl);
InputStream is = url.openStream();
ParseHTML parse = new ParseHTML(is);
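In the recipe, ParseHTML consumes the InputStream. As a hedged illustration of what consuming and releasing such a stream can look like with the standard library alone, the sketch below reads a stream to the end and closes it with try-with-resources. The class StreamReadSketch and the method readAll are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamReadSketch {
    // Reads the whole stream into a String and closes the stream afterwards.
    // Assumption: the content is UTF-8 encoded.
    static String readAll(InputStream is) throws IOException {
        try (InputStream in = is) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toString("UTF-8");
        }
    }
}
```

Passing the result of url.openStream() to readAll would return the raw page text, which can be handy for inspecting the server's response before parsing it.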
With the ParseHTML object set up, we can advance to the beginning of the HTML list.
The advance method was covered in Chapter 6, “Extracting Data”.