HTTP encryption is supported through HTTPS. The URL of any website that uses HTTPS
begins with the prefix https:// . HTTPS encrypts the data so that a third party who
intercepts the packets exchanged between your browser and the web server cannot read
them. Support for HTTPS is built into Java. You are only required to use an HTTPS URL
where you would normally use an HTTP URL.
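As a minimal sketch of the idea, and not the chapter's own recipe, the following checks whether a URL uses HTTPS by inspecting its scheme. The class name HttpsCheck, the method isHttps, and the example.com addresses are assumptions made for illustration.

```java
import java.net.URI;

class HttpsCheck {
    // Hypothetical helper: true when the URL's scheme is "https"
    // (compared case-insensitively, since URL schemes ignore case).
    static boolean isHttps(String url) {
        String scheme = URI.create(url).getScheme();
        return "https".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        // example.com is a placeholder host; no network access happens here.
        System.out.println(isHttps("https://www.example.com/")); // true
        System.out.println(isHttps("http://www.example.com/"));  // false
    }
}
```

The check only parses the string and never opens a connection. Opening one works the same way for both schemes, which is why swapping in an HTTPS URL is normally the only change needed.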
HTTP authentication allows the web server to prompt the user for a user id and password.
The web server can then determine the identity of the user accessing it. Most web
sites do not use HTTP authentication; instead, they use their own HTML forms to authenticate.
However, some sites do use HTTP authentication, and to access these sites with a bot,
you will have to support it.
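Java's URL connection classes do not add an HTTP authentication header on their own
(the separate java.net.Authenticator class aside). Therefore, to support HTTP authentication
as this chapter does, you must add the header yourself. Additionally, the credentials in the
header must be encoded. Base-64 is used for this. It is an encoding, not encryption, so the
header is only protected in transit when it is sent over HTTPS.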
This chapter provided two recipes. The first determines if a URL is using HTTPS. The
second recipe downloads data from an HTTP authenticated site.
Up to this point, the chapters have shown you how to access data. In the next chapter we
will begin to learn what to do with HTTP data once you have retrieved it. Chapter 6 will show
how to parse HTML and extract data from forms, lists, tables and other HTML constructs.