}
}
The process method of this recipe performs most of the actual work of submitting
the form and processing the response. Two parameters are passed to the process method:
the first is the search string to use, and the second indicates whether we will be doing
a state search or a capital search.
The process method begins by setting up several variables that will be needed. The
states or capitals returned from the search will be in an HTML list, so the names of the
starting and ending tags, which in this case are ul and /ul (for <ul> and </ul>), are
stored in the variables listType and listTypeEnd. Additionally, a StringBuilder named
buffer is created to hold the HTML text as it is encountered. The boolean variable
capture indicates whether text is currently being captured to the StringBuilder.
String listType = "ul";
String listTypeEnd = "/ul";
StringBuilder buffer = new StringBuilder();
boolean capture = false;
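To show how these four variables might work together, here is a minimal sketch of the capture pattern. It is not the recipe's ParseHTML-based loop: the class name CaptureSketch, the methods extractListItems and flush, and the simple character scanner are all assumptions made for illustration. The scanner only recognizes tags written without attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the recipe's parsing loop; not the book's ParseHTML class.
class CaptureSketch {

    // Returns the text of each item found between <ul> and </ul>.
    static List<String> extractListItems(String html) {
        String listType = "ul";
        String listTypeEnd = "/ul";
        StringBuilder buffer = new StringBuilder();
        boolean capture = false;
        List<String> items = new ArrayList<>();

        int i = 0;
        while (i < html.length()) {
            char ch = html.charAt(i);
            if (ch == '<') {
                int close = html.indexOf('>', i);
                if (close < 0) {
                    break; // malformed input: unterminated tag
                }
                String tag = html.substring(i + 1, close).trim().toLowerCase(Locale.ROOT);
                if (tag.equals(listType)) {
                    capture = true; // start capturing at the opening list tag
                } else if (capture
                        && (tag.equals(listTypeEnd) || tag.equals("li") || tag.equals("/li"))) {
                    flush(buffer, items); // an item boundary ends the current text run
                    if (tag.equals(listTypeEnd)) {
                        capture = false; // stop capturing at the closing list tag
                    }
                }
                i = close + 1;
            } else {
                if (capture) {
                    buffer.append(ch); // text is kept only while capture is on
                }
                i++;
            }
        }
        return items;
    }

    // Moves any non-blank buffered text into the result list and clears the buffer.
    private static void flush(StringBuilder buffer, List<String> items) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            items.add(text);
        }
        buffer.setLength(0);
    }
}
```

Calling CaptureSketch.extractListItems on a page that contains a two-item list returns the two item strings; text outside the list is ignored because capture is false there.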
The FormUtility class is designed to output to an OutputStream. For an HTTP POST
request, this would be fine. However, since this is an HTTP GET request, the form data
must be encoded into the URL. To do this, we create a ByteArrayOutputStream. This
stream allows the FormUtility to output the form data to an OutputStream, and once it
is done we can obtain the formatted name-value pairs.
The three calls to the add method below set up the required name-value pairs for the
form.
// Build the URL.
ByteArrayOutputStream bos = new ByteArrayOutputStream();
FormUtility form = new FormUtility(bos, null);
form.add("search", search);
form.add("type", type);
form.add("action", "Search");
form.complete();
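For readers without the FormUtility class at hand, the following sketch shows roughly what ends up in the ByteArrayOutputStream. QueryStringSketch is a hypothetical stand-in, not the book's FormUtility, and its add and build methods are assumptions. It relies only on the standard java.net.URLEncoder, which produces application/x-www-form-urlencoded text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for FormUtility: writes name=value pairs joined by '&'.
class QueryStringSketch {
    private final OutputStream os;
    private boolean first = true;

    QueryStringSketch(OutputStream os) {
        this.os = os;
    }

    // URL-encodes one pair and writes it, preceded by '&' for every pair after the first.
    void add(String name, String value) throws IOException {
        if (!first) {
            os.write('&');
        }
        String pair = URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
        os.write(pair.getBytes(StandardCharsets.US_ASCII)); // encoded output is pure ASCII
        first = false;
    }

    // Builds the same three pairs as the recipe and returns them as a query string.
    static String build(String search, String type) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            QueryStringSketch form = new QueryStringSketch(bos);
            form.add("search", search);
            form.add("type", type);
            form.add("action", "Search");
            return bos.toString("US-ASCII");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because every name and value passes through URLEncoder, characters such as spaces and ampersands in the search string cannot break the structure of the query string.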
Next, the URL must be constructed by appending the output from the
ByteArrayOutputStream to the base URL. The URL can then be opened and downloaded. A
ParseHTML object is then created to parse the HTML.
String surl = "http://www.httprecipes.com/1/7/get.php?" + bos.toString();
URL url = new URL(surl);
InputStream is = url.openStream();
ParseHTML parse = new ParseHTML(is);
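The concatenation step can be checked without touching the network. The sketch below assumes a query string that is already encoded; the class name GetUrlSketch and the method buildUrl are hypothetical. It builds the URL through java.net.URI, which rejects strings containing illegal characters such as unencoded spaces.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper that isolates the URL-building step of the recipe.
class GetUrlSketch {
    static final String BASE = "http://www.httprecipes.com/1/7/get.php";

    // Appends an already-encoded query string to the base URL after a '?'.
    static URL buildUrl(String encodedQuery) {
        String surl = BASE + "?" + encodedQuery;
        try {
            // URI.create throws IllegalArgumentException on illegal characters.
            return URI.create(surl).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + surl, e);
        }
    }
}
```

The resulting URL object exposes the pieces through getHost, getPath, and getQuery, which is a convenient way to confirm that the form data landed after the question mark. When the stream returned by openStream is used in real code, wrapping it in a try-with-resources statement ensures it is closed.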
With the ParseHTML object set up, we can advance to the beginning of the HTML list.
The advance method was covered in Chapter 6, “Extracting Data”.