given parameter. Catledge and Pitkow (1995) studied user page view times on the WWW
and recommended thirty minutes as a reasonable maximum interval between requests within
a user session. Figure 7 illustrates the user sessions inferred from the log data.
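The timeout rule can be sketched in a few lines of Python. This is only an illustration, not the procedure behind Figure 7: the function name `split_sessions`, the `(user, timestamp, page)` record layout, and the sample log are assumptions made for the example, and the code treats a gap strictly longer than thirty minutes as the start of a new session.

```python
from datetime import datetime, timedelta

# Thirty-minute threshold cited in the text.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group (user, timestamp, page) records into per-user sessions.

    A new session starts whenever the gap between two consecutive
    requests from the same user is longer than `timeout`.
    """
    sessions = {}
    # Sort by user, then by time, so each user's requests are consecutive.
    for user, ts, page in sorted(requests, key=lambda r: (r[0], r[1])):
        user_sessions = sessions.setdefault(user, [])
        if user_sessions and ts - user_sessions[-1][-1][0] <= timeout:
            user_sessions[-1].append((ts, page))   # continue current session
        else:
            user_sessions.append([(ts, page)])     # open a new session
    return sessions


# Hypothetical log records, invented for illustration only.
log = [
    ("u1", datetime(2024, 1, 1, 9, 0), "P1"),
    ("u1", datetime(2024, 1, 1, 9, 10), "P2"),
    ("u1", datetime(2024, 1, 1, 10, 0), "P3"),  # 50-minute gap: new session
]
sessions = split_sessions(log)
pages_by_session = [[page for _, page in s] for s in sessions["u1"]]
```

Here `pages_by_session` holds two sessions for user `u1`: one with P1 and P2, and a second with P3 alone.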
Step 1.3: Data Warehousing
After the data preprocessing is finished and all irrelevant records have been removed
from the web log, the cleaned data are stored in the main table for further processing. We
store the web usage in a data warehouse so that the log of accesses to the target web page
and to the pages visited before it can be analyzed as traversal patterns. These pages are
very likely to be accessed together. The user access path records for these web pages are
stored in the fact table of the data warehouse, with their dates stored in the dimension table.
The algorithm for recording user access paths into the data warehouse is shown in Figure 8.
Figure 9 shows the star schema for web usage by access path over an interval within a period.
Any user, identified by a UID or an IP address, may follow many navigation paths while browsing
the web site. For example, if the user accesses P1, P2 and P3 in sequential order, the web
page access path runs from P1 to P2 to P3. (Note: the frequency pattern count is the number
of times the path was browsed.)
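As a rough illustration of the frequency pattern count, the sketch below tallies how often each access path leading to a target page occurs. It is not the algorithm of Figure 8; the function `count_access_paths`, the representation of a session as a list of page names, and the sample sessions are all assumptions made for the example.

```python
from collections import Counter


def count_access_paths(sessions, target_page):
    """Count how often each access path ending at `target_page` occurs.

    `sessions` is a list of page sequences. The path for a session is the
    sequence of pages from the start of the session up to and including
    the first visit to the target page; sessions that never reach the
    target are ignored.
    """
    counts = Counter()
    for pages in sessions:
        if target_page in pages:
            end = pages.index(target_page)
            counts[tuple(pages[: end + 1])] += 1
    return counts


# Hypothetical sessions, invented for illustration only.
sessions = [
    ["P1", "P2", "P3"],
    ["P1", "P2", "P3"],
    ["P2", "P3"],
    ["P1", "P4"],        # never reaches P3, so it is not counted
]
path_counts = count_access_paths(sessions, "P3")
```

With these sample sessions, the path from P1 to P2 to P3 has a count of 2 and the path from P2 to P3 has a count of 1.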
We use the event attribute in the constraint class of the frame model metadata to
update the data warehouse data cube automatically, continuously and incrementally. For
example, the dimension tables and the fact tables are as follows:
Dimension table Time relation R_TIME

Year    Month    Day
Year1   Month1   Day1

Dimension table Target Page relation R_TARGET

Target Page   Count
T1            C1

Fact table destination relation R_FACT (Date)

Target Page  Date    CPB     CFP     FP      Count(CPB)  Count(CFP)  Count(FP)  Duration
T1           Date1   Path 1  Path 2  Path 3  C1          C2          C3         D1

Fact table destination relation R_FACT (Month)

Target Page  Month   CPB     CFP     FP      Count(CPB)  Count(CFP)  Count(FP)  Duration
T1           Month1  Path 1  Path 2  Path 3  C1          C2          C3         D1

Fact table destination relation R_FACT (Year)

Target Page  Year    CPB     CFP     FP      Count(CPB)  Count(CFP)  Count(FP)  Duration
T1           Year1   Path 1  Path 2  Path 3  C1          C2          C3         D1
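
The three fact tables share the same columns at day, month and year granularity. One plausible reading, which is an assumption here and is not stated in the text, is that the coarser tables can be obtained by summing the path counts of the daily table. The sketch below illustrates that reading only; the function `roll_up`, the simplified row layout `(target page, date, path, count)`, and the sample rows are hypothetical, and the Duration column is left out because its aggregation rule is not given.

```python
from collections import defaultdict
from datetime import date


def roll_up(daily_facts, level):
    """Sum daily path counts to 'month' or 'year' granularity.

    Each row is (target_page, day, path, count). The result maps
    (target_page, period, path) to the summed count, where period is
    (year, month) for level 'month' and (year,) for level 'year'.
    """
    if level not in ("month", "year"):
        raise ValueError("level must be 'month' or 'year'")
    totals = defaultdict(int)
    for target, day, path, count in daily_facts:
        period = (day.year, day.month) if level == "month" else (day.year,)
        totals[(target, period, path)] += count
    return dict(totals)


# Hypothetical daily fact rows, invented for illustration only.
path = ("P1", "P2", "T1")
daily = [
    ("T1", date(2024, 1, 5), path, 3),
    ("T1", date(2024, 1, 9), path, 2),
    ("T1", date(2024, 2, 1), path, 4),
]
monthly = roll_up(daily, "month")
yearly = roll_up(daily, "year")
```

For the sample rows, the January total is 5, the February total is 4, and the yearly total is 9.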

Dimension table tuples to be updated, δR (data to be loaded into the data warehouse):

Dimension table Time relation R'_TIME

Year    Month    Day
Year2   Month2   Day2

Dimension table Target Page relation R'_TARGET

Target Page   Count
T2            C2
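
To make the idea of an incremental update concrete, the sketch below merges a batch of new tuples δR into the target page dimension table. It is an assumption-laden illustration, not the frame model mechanism described in the text: the function `apply_delta`, the representation of the table as a dictionary from target page to count, and the rule that counts for an existing page are added together are all hypothetical.

```python
def apply_delta(r_target, delta):
    """Return a copy of `r_target` with the tuples in `delta` merged in.

    Both arguments map a target page to a count. A page that is new to
    the table is inserted; a page that already exists has its count
    increased (additive merging is an assumption of this sketch).
    """
    merged = dict(r_target)
    for page, count in delta.items():
        merged[page] = merged.get(page, 0) + count
    return merged


# Hypothetical table contents and updates, invented for illustration only.
r_target = {"T1": 10}
r_target_new = apply_delta(r_target, {"T2": 4})      # new tuple inserted
r_target_new = apply_delta(r_target_new, {"T1": 1})  # existing count raised
```

Because `apply_delta` returns a new dictionary, the original table is left unchanged, which keeps each update step easy to inspect.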
 