SIMPLE REQUESTS - HTTP Programming Recipes for Java Bots

The main portion of this program is contained in a method named go. The following

three lines do most of the work performed by the go method.

URL u = new URL("http://www.httprecipes.com/1/3/time.php");

String str = downloadPage(u);

System.out.println(extract(str, "", "", 1));

First, a URL object is constructed with the URL that we are to download from. This URL

object is then passed to the downloadPage function.
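The downloadPage function comes from the previous recipe and is not reproduced here. As a rough illustration of what such a function might do, the following is a minimal stand-in, not the book's implementation. It assumes the page is UTF-8 text and that reading the whole response through URL.openStream is acceptable; the class name DownloadSketch and the command-line handling are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DownloadSketch {
    // Hypothetical stand-in for the downloadPage function of the previous
    // recipe: reads everything the URL returns and decodes it as UTF-8
    // (the encoding is an assumption).
    public static String downloadPage(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // The page address is taken from the command line, for example the
        // time.php URL used in this recipe.
        if (args.length != 1) {
            System.out.println("usage: java DownloadSketch <url>");
            return;
        }
        // URI.create(...).toURL() is used because the URL(String)
        // constructor is deprecated in recent JDKs.
        URL u = URI.create(args[0]).toURL();
        System.out.println(downloadPage(u));
    }
}
```

Because the function only depends on URL.openStream, it can also be tried against a local file: URL without touching the network.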

Using the downloadPage function from the last recipe, we can download the above
HTML into a string. With the data in a string, the question becomes how best to
extract the date and time. Any Java string parsing method can be used to do this.
However, this recipe provides one very useful function for the purpose, named
extract. The body of the extract function is shown here:

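int location1 = -1;  // index of the current match of token1

// Advance to the count-th occurrence of token1.
do
{
  // Search just past the previous match so each pass finds the next one.
  location1 = str.indexOf(token1, location1 + 1);
  if (location1 == -1)
    return null;  // fewer than count occurrences of token1
  count--;
} while (count > 0);

// Look for token2 only after the end of token1.
int location2 = str.indexOf(token2, location1 + token1.length());
if (location2 == -1)
  return null;

// Return the text between the two tokens, excluding the tokens themselves.
return str.substring(location1 + token1.length(), location2);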

As you can see above, the extract function is passed the string to parse, along with

the beginning and ending tags. The extract function will then scan the specified

string, and find the beginning tag. In this case, the beginning tag is . Once the beginning

tag is found, the extract function will return all text found until the ending tag is found.

It is important to note that the beginning and ending text need not be HTML tags. You

can use any beginning and ending text you wish with the extract function.
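To illustrate that the delimiters need not be HTML tags, here is a small self-contained sketch. The method signature is inferred from the call shown earlier, and the class name ExtractSketch and the sample record string are made up for this example. Note that this sketch searches just past each previous match of the beginning text, so that counts above one reach later occurrences.

```java
public class ExtractSketch {
    // Returns the text between the count-th occurrence of token1 and the
    // next occurrence of token2, or null if either one cannot be found.
    // The signature is an assumption inferred from the call in the recipe.
    public static String extract(String str, String token1, String token2, int count) {
        int location1 = -1;
        do {
            // Search just past the previous match of token1.
            location1 = str.indexOf(token1, location1 + 1);
            if (location1 == -1) {
                return null;
            }
            count--;
        } while (count > 0);

        int start = location1 + token1.length();
        int location2 = str.indexOf(token2, start);
        if (location2 == -1) {
            return null;
        }
        return str.substring(start, location2);
    }

    public static void main(String[] args) {
        // Hypothetical sample data: the delimiters are "time=" and ";".
        String record = "host=example;time=12:30;zone=UTC;";
        System.out.println(extract(record, "time=", ";", 1)); // prints 12:30
    }
}
```

Returning null when either delimiter is missing lets the caller distinguish "not found" from an empty match, at the cost of requiring a null check.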

You might also notice that the extract function accepts a number as its last
parameter. In this case, the number passed was one. This number specifies which
instance of the beginning text to locate. In this example there was only one to
find. If there were several, passing in two would locate the text that follows
the second instance of the beginning tag.
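The effect of the last parameter can be sketched with a string that contains the beginning text more than once. The sample HTML, the class name ExtractCountSketch, and the extractAll helper are hypothetical and only serve the illustration; the extract signature is inferred from the call in the recipe, and this version searches past each previous match so that higher counts reach later instances.

```java
import java.util.ArrayList;
import java.util.List;

public class ExtractCountSketch {
    // Text between the count-th token1 and the next token2, or null.
    public static String extract(String str, String token1, String token2, int count) {
        int location1 = -1;
        do {
            location1 = str.indexOf(token1, location1 + 1);
            if (location1 == -1) {
                return null;
            }
            count--;
        } while (count > 0);

        int start = location1 + token1.length();
        int location2 = str.indexOf(token2, start);
        return (location2 == -1) ? null : str.substring(start, location2);
    }

    // Hypothetical helper: collects every instance by raising count
    // until extract reports that nothing more was found.
    public static List<String> extractAll(String str, String token1, String token2) {
        if (token1.isEmpty()) {
            // An empty beginning text matches everywhere and would never run out.
            throw new IllegalArgumentException("token1 must not be empty");
        }
        List<String> found = new ArrayList<>();
        for (int count = 1; ; count++) {
            String text = extract(str, token1, token2, count);
            if (text == null) {
                break;
            }
            found.add(text);
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<b>first</b> and <b>second</b>";
        System.out.println(extract(html, "<b>", "</b>", 1)); // prints first
        System.out.println(extract(html, "<b>", "</b>", 2)); // prints second
        System.out.println(extract(html, "<b>", "</b>", 3)); // prints null
        System.out.println(extractAll(html, "<b>", "</b>")); // prints [first, second]
    }
}
```

Calling extract repeatedly rescans the string from the start each time, which is fine for short pages; for long documents a single pass that remembers its position would be cheaper.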
