The main portion of this program is contained in a method named go. The following
three lines do the main work performed by the go method.
URL u = new URL("http://www.httprecipes.com/1/3/time.php");
String str = downloadPage(u);
System.out.println(extract(str, "<b>", "</b>", 1));
First, a URL object is constructed with the URL that we are to download from. This
URL object is then passed to the downloadPage function.
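The downloadPage function comes from the previous recipe and is not reproduced here. As a
rough sketch of what such a function might look like, assuming it simply reads the whole
response body into a string (the class name DownloadSketch, the buffer size, and the UTF-8
decoding are assumptions made for illustration, not the book's code):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class DownloadSketch {
    // Hypothetical stand-in for the previous recipe's downloadPage:
    // reads everything the URL returns and decodes it as UTF-8 (an assumption).
    static String downloadPage(URL url) throws IOException {
        StringBuilder result = new StringBuilder();
        char[] buffer = new char[4096];
        try (InputStream in = url.openStream();
             Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int length;
            // Copy the stream into the builder one buffer at a time.
            while ((length = reader.read(buffer)) != -1) {
                result.append(buffer, 0, length);
            }
        }
        return result.toString();
    }
}
```

A real bot would probably also want to honor the character set declared by the server rather
than assuming UTF-8.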
Using the downloadPage function from the last recipe, we can download the above HTML into
a string. With the data in a string, what is the easiest way to extract the date and time?
Any Java string parsing method can be used to do this. However, this recipe provides one
very useful function for the job, named extract. The contents of the extract function are
shown here:
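int location1 = -1;
int location2;
// Step forward to the count-th occurrence of token1. Each search starts one
// character past the previous match so the same match is not found twice.
do
{
  location1 = str.indexOf(token1, location1 + 1);
  if (location1 == -1)
    return null;
  count--;
} while (count > 0);
// Look for the ending text only after the end of the beginning text.
location2 = str.indexOf(token2, location1 + token1.length());
if (location2 == -1)
  return null;
return str.substring(location1 + token1.length(), location2);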
As you can see from above, the extract function is passed the string to parse, along with
the beginning and ending tags. The extract function will then scan the specified string
and find the beginning tag. In this case, the beginning tag is <b>. Once the beginning tag
is found, the extract function will return all text between it and the ending tag. If
either tag cannot be found, the function returns null.
It is important to note that the beginning and ending text need not be HTML tags. You
can use any beginning and ending text you wish with the extract function.
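To make this concrete, here is a minimal, self-contained sketch of extract used with both
HTML tags and plain text as delimiters. The method signature is inferred from the call
extract(str, "<b>", "</b>", 1); the class name ExtractSketch and the sample strings are
made up for illustration. This sketch starts each search one character past the previous
match and looks for the ending text only after the beginning text.

```java
class ExtractSketch {
    // Signature inferred from the call extract(str, "<b>", "</b>", 1).
    static String extract(String str, String token1, String token2, int count) {
        int location1 = -1;
        // Step forward to the count-th occurrence of the beginning text.
        do {
            location1 = str.indexOf(token1, location1 + 1);
            if (location1 == -1) {
                return null;
            }
            count--;
        } while (count > 0);
        // The wanted text begins right after the beginning text.
        int start = location1 + token1.length();
        int location2 = str.indexOf(token2, start);
        if (location2 == -1) {
            return null;
        }
        return str.substring(start, location2);
    }

    public static void main(String[] args) {
        // HTML tags as delimiters; prints "Jun 27 2006".
        System.out.println(extract("<h1>Time</h1><b>Jun 27 2006</b>", "<b>", "</b>", 1));
        // Arbitrary text as delimiters; prints "admin".
        System.out.println(extract("name=Ada; role=admin;", "role=", ";", 1));
    }
}
```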
You might also notice that the extract function accepts a number as its last parameter.
In this case, the number passed was one. This number specifies which instance of the
beginning text to locate. In this example there was only one <b> to find. What if there
were several? Passing in a two would have located the text at the second instance of the
<b> tag.
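The effect of the last parameter can be sketched as follows. The class name CountSketch and
the sample HTML are hypothetical, and the extract method is written so that each search
resumes one character past the previous match, which is what lets a count above one reach
later instances.

```java
class CountSketch {
    // Same idea as the recipe's extract; signature inferred from its call site.
    static String extract(String str, String token1, String token2, int count) {
        int location1 = -1;
        do {
            // Resume just past the previous match so later instances are reached.
            location1 = str.indexOf(token1, location1 + 1);
            if (location1 == -1) {
                return null;
            }
            count--;
        } while (count > 0);
        int start = location1 + token1.length();
        int location2 = str.indexOf(token2, start);
        return (location2 == -1) ? null : str.substring(start, location2);
    }

    public static void main(String[] args) {
        String html = "<b>first</b> and <b>second</b>";
        System.out.println(extract(html, "<b>", "</b>", 1)); // prints "first"
        System.out.println(extract(html, "<b>", "</b>", 2)); // prints "second"
        System.out.println(extract(html, "<b>", "</b>", 3)); // prints "null": no third <b>
    }
}
```

Asking for an instance that does not exist yields null, so callers should check the result
before using it.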