difficult and time-consuming task. Furthermore, when there are changes in the monitored
system, these rules must be changed accordingly.
Clickstream analysis is another research area similar to our work in that it visually
presents users' activities on the web. The two differ in their main purpose: our work
focuses on security events, while clickstream analysis focuses on normal users'
activities. Currently, clickstream analysis is an important tool for owners of electronic
commerce websites. Merchants use it to learn more about their website visitors: where
they come from, how they use the site, at which page they exit the site, at which page
they decide to buy products, and so on. Clickstream data is usually presented in visual
form, and website owners can interact with it. For example, Lee et al. propose two
visualization methods for clickstream analysis: parallel coordinates and starfield
graphs [9]. Parallel coordinates show the sequence of user steps on a website, such as
look, click, and buy, as well as the ratio of users who drop out after each step.
Starfield graphs display product performance in terms of how many times products are
viewed and how many times they are clicked after viewing. Another work, by Kawamoto and
Itoh, integrates and visualizes users' access patterns with the existing website link
structure for a better understanding of users' activities [10]. Commercial solutions in
this area are also available, such as Google Analytics [11] and Webtrends [12].
3 System Architecture
Fig. 1 depicts the architecture of our web application attack scenario construction and
visualization system. The system receives input data from the web server access log,
error log, IDS log, etc. The architecture is open, so new kinds of data can be added
later to extend the system's capability. In addition to these data, which provide
information about users' activities on web applications, the system also receives the
web applications' own structural data from an external crawler. This structural data
contains information about the linking relationships between web pages. Structural data
and activity data are displayed in visual form to security administrators.
Administrators can interact with the visual interface to adjust the information shown.
These interactions can affect both the scenario construction and scenario visualization
components.
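
As a rough illustration of how structural data and activity data might be brought together before display, the following minimal sketch stores the crawler's link relationships as an adjacency map and marks each consecutive page transition of a visitor according to whether the crawler found a matching link. The function names, the `(source, target)` edge format, and the idea of marking transitions are assumptions made for this example; they are not taken from the system described here.

```python
from collections import defaultdict


def build_link_graph(edges):
    """Build an adjacency map from (source_page, linked_page) pairs.

    The pair format is a hypothetical crawler output, assumed for this sketch.
    """
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)
    return graph


def overlay_visits(graph, visited_uris):
    """Pair each consecutive page transition with a flag that tells
    whether the link graph contains a link for that transition."""
    return [
        (src, dst, dst in graph.get(src, set()))
        for src, dst in zip(visited_uris, visited_uris[1:])
    ]
```

For instance, with links `/` to `/login` and `/login` to `/home`, a visit sequence `/`, `/login`, `/admin` yields one transition that follows a known link and one that does not. How such transitions are interpreted or drawn is left open here.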
3.1 Data Collection and Preprocessing Component
This component collects users' activity data from various sources. Each recorded request
should contain the access time, the URI of the accessed page, the IP address, the user
agent, and the query string. Our current prototype implementation collects data only
from a web application IDS and an access log file generated by the Apache web server. In
later versions, we intend to include more input sources. After collection, a
preprocessing step ensures that all collected data are standardized in a common format.
Furthermore, because the collected data are spread over many places, the preprocessing
component stores them in a central database for easy extraction and processing later.
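
The preprocessing step described above can be sketched as follows. This is a minimal illustration, not the prototype's actual code: it assumes the access log uses Apache's "combined" log format, uses SQLite as a stand-in for the central database, and introduces the record field names, the `requests` table, and both function names purely for the example.

```python
import re
import sqlite3
from datetime import datetime
from urllib.parse import urlsplit

# Apache "combined" format (assumed):
# host ident user [time] "request line" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def normalize_access_line(line, source="apache_access"):
    """Convert one access log line into the common record format.

    Returns None when the line does not match the assumed log format.
    """
    m = COMBINED.match(line)
    if m is None:
        return None
    parts = urlsplit(m.group("target"))
    ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return {
        "source": source,
        "access_time": ts.isoformat(),
        "uri": parts.path,
        "query_string": parts.query,
        "ip": m.group("ip"),
        "user_agent": m.group("agent"),
    }


def store(conn, records):
    """Insert normalized records into a central table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests ("
        "source TEXT, access_time TEXT, uri TEXT, "
        "query_string TEXT, ip TEXT, user_agent TEXT)"
    )
    conn.executemany(
        "INSERT INTO requests VALUES "
        "(:source, :access_time, :uri, :query_string, :ip, :user_agent)",
        records,
    )
    conn.commit()
```

Records from other sources, such as the IDS, could be mapped to the same dictionary shape by their own normalizer and passed to the same `store` function, which is one way to keep the input side open to new data sources.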