CHAPTER 2: EXAMINING HTTP TRAFFIC
• Using Wireshark
• Using Network Analyzers for Debugging
• Analyzing Cookies
• Analyzing Forms
The goal of most bot programs is to access data that a web user could access with a
web browser. The advantage of a bot is that it is automated and can access a large
amount of data quickly. Creating bots can be challenging. If the data to be accessed is
on a single public web page, the task is easy. Usually, however, a bot must navigate through a
series of pages to find the data it needs.
Why would a bot need to navigate through several pages to access a piece of data? Perhaps
the most common reason is that some web sites require users to log in to the web server
before they are allowed to view the data. Your bank, for example, would surely require you
to log in to its web site before viewing your account balances. To access such a site, the
bot must send the web server the same data, in exactly the same format, as a regular
browser session with a human user.
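As a rough illustration of "the same data in exactly the same format", the sketch below
encodes login fields the way a browser typically encodes a form posted as
application/x-www-form-urlencoded. The class name LoginFormBody and the field names uid
and pwd are hypothetical; a real site's field names must be read from its login form or
from captured traffic.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds the body of a form POST from name/value pairs.
class LoginFormBody {

    // Encode each name and value, join them with '=', and join the pairs with '&'.
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String name = URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8);
            String value = URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8);
            body.add(name + "=" + value);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order, as a form would.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("uid", "guest");       // assumed field name
        fields.put("pwd", "p@ss word");   // assumed field name
        System.out.println(encode(fields)); // prints uid=guest&pwd=p%40ss+word
    }
}
```

The URLEncoder.encode(String, Charset) overload used here requires Java 10 or later.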
These more complex bots can be difficult to debug manually. Fortunately, a program called
a “network analyzer” makes manual debugging unnecessary. Network analyzers are also
frequently referred to as “packet sniffers”.
Using a Network Analyzer
A network analyzer is a program that monitors the TCP/IP traffic between a web browser and
a web server. By monitoring a typical web browser as it accesses the desired web server,
you can see exactly what information is transmitted.
The network analyzer is useful during all of the bot's development phases. Initially, the
network analyzer can be used to analyze a typical session with the desired web server. This
shows the HTTP requests and responses the bot must support. Once the bot is created, the
network analyzer is used again during the debugging process and then to verify the final
product.
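One way to picture the debugging step is to compare the headers a browser sent, as seen in
the network analyzer, with the headers the bot sends. The sketch below is only an
illustration of that idea, not a feature of any particular analyzer. The class name
HeaderDiff and the sample header values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: reports browser headers that the bot omits or sends differently.
class HeaderDiff {

    static List<String> differences(Map<String, String> browser, Map<String, String> bot) {
        // HTTP header names are case-insensitive, so compare them that way.
        Map<String, String> botHeaders = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        botHeaders.putAll(bot);
        Map<String, String> browserHeaders = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        browserHeaders.putAll(browser);

        List<String> report = new ArrayList<>();
        for (Map.Entry<String, String> header : browserHeaders.entrySet()) {
            String botValue = botHeaders.get(header.getKey());
            if (botValue == null) {
                report.add("missing: " + header.getKey());
            } else if (!botValue.equals(header.getValue())) {
                report.add("differs: " + header.getKey());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Map<String, String> browser = Map.of(
                "User-Agent", "Mozilla/5.0", "Cookie", "session=abc", "Accept", "*/*");
        Map<String, String> bot = Map.of("user-agent", "Java/17", "accept", "*/*");
        System.out.println(differences(browser, bot));
        // prints [missing: Cookie, differs: User-Agent]
    }
}
```

A missing Cookie header, as in this sample, is the kind of discrepancy that would explain
why a server treats the bot differently from a logged-in browser.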
Using a Network Analyzer to Design a Bot
The first step in designing a bot is to analyze the HTTP requests and responses that flow
between the web browser and web server. The bot will need to emulate these requests in
order to obtain the desired data from the web server.
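To emulate a request, the bot first needs its parts: the method, the path, and the headers.
The sketch below splits the text of a captured HTTP request into those parts. It assumes
the request has been copied out of the network analyzer as plain text. The class name
CapturedRequest and the sample request are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits the text of one captured HTTP request into its parts.
class CapturedRequest {
    final String method;
    final String path;
    final String version;
    final Map<String, String> headers = new LinkedHashMap<>();

    CapturedRequest(String raw) {
        String[] lines = raw.split("\r?\n");

        // The first line is the request line, for example "POST /login HTTP/1.1".
        String[] requestLine = lines[0].trim().split(" ");
        if (requestLine.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + lines[0]);
        }
        method = requestLine[0];
        path = requestLine[1];
        version = requestLine[2];

        // Header lines follow until the first blank line; any body after it is ignored.
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty()) {
                break;
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                headers.put(line.substring(0, colon).trim(),
                            line.substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        // Sample capture, made up for illustration.
        String raw = "POST /login HTTP/1.1\r\n"
                + "Host: www.example.com\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "\r\n"
                + "uid=guest";
        CapturedRequest request = new CapturedRequest(raw);
        System.out.println(request.method + " " + request.path);
        System.out.println(request.headers);
    }
}
```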
 