Figure 14: Identify target customers by categorization
groups that need to be extracted and identified by their UIDs. Their specific path traversal
patterns are required for further study. The web page(s) common to all the path traversal
patterns that lead to unsuccessful registration are the critical web page(s): web
advertisements should be targeted there to influence user behavior, changing the subsequent
path traversal patterns so that users stay on web page #8 long enough to register. Web
pages that never lead to page #8 are candidates for contraction, whether by revision of the
web content, merging, consolidation or even elimination, depending on the individual case
and on further study of the web page content. Many web pages leave web surfers with
unfocused content, or overwhelm surfers and e-customers with too much advertising
information. The OLAM method could assist in filtering the site so that only mission-critical
web pages survive in the final web site infrastructure. As our targeted result is a list of
potential e-customers for a certain product or service on a web site, together with the
associated rules derived, we could trace this related knowledge by further analyzing the main
tables in conjunction with the discovered association rules. We could classify the UIDs by
web page sequence. Because the key of the main table, the identification code, gives the UID
(User ID), we could identify the target e-customers further. We can even segment the target
e-customers not only by their web page preference, but also by their gender, occupation type,
income range and age group. As such, more customer-oriented web advertisement(s) could be
placed on their preferred web page(s) for more effective marketing.
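
The analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each session is available as a UID, an ordered list of visited page numbers, and a flag recording whether registration succeeded. The session data, the profile attributes, and all function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

TARGET_PAGE = 8  # the registration page (page #8 in the text)

# Hypothetical sessions: (UID, ordered pages visited, registered?)
Session = Tuple[str, List[int], bool]

SESSIONS: List[Session] = [
    ("u1", [1, 3, 8], True),
    ("u2", [1, 3, 5], False),
    ("u3", [2, 3, 5, 8], False),  # reached page 8 but left before registering
    ("u4", [1, 4, 8], True),
    ("u5", [3, 5, 6], False),
]

# Hypothetical demographic attributes keyed by UID.
PROFILES: Dict[str, Dict[str, str]] = {
    "u2": {"gender": "F", "age_group": "18-25"},
    "u3": {"gender": "M", "age_group": "26-35"},
    "u5": {"gender": "M", "age_group": "18-25"},
}

ALL_PAGES: Set[int] = {1, 2, 3, 4, 5, 6, 7, 8}


def critical_pages(sessions: List[Session]) -> Set[int]:
    """Pages common to every path that ended without registration."""
    failed = [set(pages) for _, pages, ok in sessions if not ok]
    return set.intersection(*failed) if failed else set()


def pages_never_leading_to_target(
    sessions: List[Session], all_pages: Set[int], target: int = TARGET_PAGE
) -> Set[int]:
    """Pages that never appear before the target page in any path."""
    leading: Set[int] = set()
    for _, pages, _ in sessions:
        if target in pages:
            leading.update(pages[: pages.index(target)])
    return all_pages - leading - {target}


def segment(
    uids: List[str], profiles: Dict[str, Dict[str, str]], attribute: str
) -> Dict[str, List[str]]:
    """Group UIDs by one demographic attribute, e.g. 'age_group'."""
    groups: Dict[str, List[str]] = {}
    for uid in uids:
        groups.setdefault(profiles[uid][attribute], []).append(uid)
    return groups


unregistered = [uid for uid, _, ok in SESSIONS if not ok]
print(critical_pages(SESSIONS))                            # {3, 5}
print(pages_never_leading_to_target(SESSIONS, ALL_PAGES))  # {6, 7}
print(segment(unregistered, PROFILES, "age_group"))
```

In this toy data, pages 3 and 5 appear in every unsuccessful path and would be the candidates for targeted advertisement, while pages 6 and 7 never precede page #8 and would be the candidates for revision or consolidation.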
PROTOTYPE
Here we demonstrate the process of online web usage mining. A university has a home
page that contains a lot of useful information (for example, course information, facilities
provided, etc.), which is distributed over several sub-pages. The person in charge wants to
know which sub-pages are more popular and whether the users who visited a particular
sub-page also intended to visit other sub-pages. The person in charge can then post relevant
information or advertisements on the sub-pages more effectively.
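
The two questions posed above (which sub-pages are popular, and whether visitors to one sub-page also visit another) can be illustrated with a small sketch. It assumes the log has already been reduced to one set of sub-pages per visitor; the sub-page names and the data are hypothetical, and the co-visit measure used here is the standard association-rule confidence, which may differ from the exact measure used in the prototype.

```python
from collections import Counter
from typing import List, Set

# Hypothetical input: the set of sub-pages each visitor viewed.
VISITS: List[Set[str]] = [
    {"courses", "facilities"},
    {"courses", "staff"},
    {"courses", "facilities", "staff"},
    {"facilities"},
]


def popularity(visits: List[Set[str]]) -> Counter:
    """Count how many visitors viewed each sub-page."""
    counts: Counter = Counter()
    for pages in visits:
        counts.update(pages)
    return counts


def confidence(visits: List[Set[str]], a: str, b: str) -> float:
    """Fraction of visitors to sub-page a who also viewed sub-page b."""
    with_a = [pages for pages in visits if a in pages]
    if not with_a:
        return 0.0  # no visitor viewed a, so the rule has no support
    return sum(1 for pages in with_a if b in pages) / len(with_a)


print(popularity(VISITS).most_common())
print(confidence(VISITS, "staff", "courses"))       # 1.0
print(confidence(VISITS, "courses", "facilities"))  # 2 of 3 visitors
```

A high confidence for a rule such as staff to courses would suggest that information aimed at course seekers is also worth posting on the staff sub-page.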
The web log file was collected from the web sites of the Computer Science Laboratory of City
University of Hong Kong. The site hosts a variety of information, ranging from department
information and department courses to individual web sites. We are interested in only five
pages for analysis, as follows: