Analyzing the information stored in such logs can help us capture overall web
usage for testing, as described in Section 5. Various measurements can be derived
to characterize web site workload at different levels of granularity and from
different perspectives. These workload measurements are used together with failure
information in Section 6 to evaluate web site operational reliability and the
potential for reliability improvement. Web logs also give us a starting point to
characterize common problems for web-based applications, as we discuss below.
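
As a rough illustration of how workload measurements might be derived from an access log, the sketch below counts hits, bytes transferred, and distinct client hosts per day. It assumes the log uses the Common Log Format; the pattern name, the function daily_workload, the choice of these three measures, and the sample lines are illustrative assumptions, not the measurements defined in this chapter.

```python
import re
from collections import defaultdict

# Assumed layout (Common Log Format):
#   host ident authuser [day:time zone] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def daily_workload(lines):
    """Aggregate hits, bytes, and distinct hosts per calendar day."""
    totals = defaultdict(lambda: {"hits": 0, "bytes": 0, "hosts": set()})
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        day = totals[match.group("day")]
        day["hits"] += 1
        if match.group("size") != "-":  # "-" means no body was sent
            day["bytes"] += int(match.group("size"))
        day["hosts"].add(match.group("host"))
    return totals

# Made-up sample entries, for illustration only.
sample = [
    '192.0.2.1 - - [10/Oct/1999:13:55:36 -0500] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/1999:14:01:02 -0500] "GET /logo.gif HTTP/1.0" 200 1024',
    '192.0.2.1 - - [11/Oct/1999:09:12:45 -0500] "GET /missing.html HTTP/1.0" 404 -',
]

for day, stats in daily_workload(sample).items():
    print(day, stats["hits"], stats["bytes"], len(stats["hosts"]))
```

Grouping by a coarser or finer key (hour, week, or client host) would give the same kind of measurement at a different level of granularity.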
3.3 Preliminary Analysis of Common Problems for Two Web Sites
We next analyze common problems for www.seas.smu.edu, the official web site
of the School of Engineering and Applied Science at Southern Methodist University
(SMU/SEAS). This web site uses Apache Web Server [4], a popular choice among
many web hosts, and shares many characteristics with the web sites of other
educational institutions. These features make our results and observations here and
in the rest of this chapter meaningful to many application environments. Server log
data covering 26 consecutive days in 1999 were used.
The use of this web site continues the trend of most previous studies, which
overwhelmingly focus on academic sites [39] that may not be representative of
many other web sites. For example, most of the SMU/SEAS web pages are static,
consisting primarily of HTML documents and embedded graphics [25], and the web
site operates under fairly light traffic. In e-commerce and other applications,
workload types may be more diverse, with dynamic pages and context-sensitive
content playing a much more important role, and traffic volume can be
significantly larger [39]. Therefore, we obtained recent (2003) web logs from the
open source KDE project web site www.kde.org to cross-validate our results [52].
Its overall user population and traffic volume are significantly larger, and
changes are continuously committed to the web site in order to provide developers
and users with the most up-to-date information. These characteristics make it a
good choice for our validation study. At the same time, this web site also uses
Apache Web Server [4], so its logs share the same data format, which simplifies
our data extraction and analysis.
Common web problems or error types are listed in Table I. Notice that most of
these errors conform closely to the source content failures we defined above, making
them suitable for our web problem characterization and reliability evaluation. Table I
also summarizes the different types of errors for SMU/SEAS. The most dominant
error types are type A (“permission denied”) and type E (“file does not exist”),
which together account for 99.9% of the recorded errors.
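
A tally of this kind can be sketched as follows. This is a minimal illustration, not the extraction procedure used in the study: it assumes each error log entry contains the message text quoted above, the mapping ERROR_TYPES and the functions classify and error_summary are hypothetical names, and the sample lines are made up to resemble Apache error log entries. Error types other than A and E are not modeled here and are grouped as "other".

```python
from collections import Counter

# Assumed mapping from message fragments to the two error types named in
# the text; the remaining types in Table I are not modeled in this sketch.
ERROR_TYPES = {
    "permission denied": "A",
    "file does not exist": "E",
}

def classify(line):
    """Return the error type label for one error log line."""
    text = line.lower()
    for fragment, label in ERROR_TYPES.items():
        if fragment in text:
            return label
    return "other"

def error_summary(lines):
    """Count errors per type and compute each type's share of the total."""
    counts = Counter(classify(line) for line in lines)
    total = sum(counts.values())
    shares = {label: count / total for label, count in counts.items()}
    return counts, shares

# Made-up sample entries, for illustration only.
sample = [
    "[Sun Oct 10 13:55:36 1999] [error] [client 192.0.2.1] "
    "File does not exist: /htdocs/missing.html",
    "[Sun Oct 10 13:58:02 1999] [error] [client 192.0.2.2] "
    "(13)Permission denied: file permissions deny server access: /htdocs/private.html",
    "[Sun Oct 10 14:03:11 1999] [error] [client 192.0.2.3] "
    "File does not exist: /htdocs/old/page.html",
]

counts, shares = error_summary(sample)
for label in sorted(counts):
    print(label, counts[label], f"{shares[label]:.1%}")
```

Matching on message fragments keeps the sketch short; a fuller version would parse the timestamp and client fields so that errors could be related to the workload measured over the same period.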