Databases Reference
In-Depth Information
intended during data collection. Secondary data analysis utilizes the data that was
collected by someone else. Transaction log data is commonly collected by Web sites
for system performance analysis. However, analysts can also use this data to address
other questions [ 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 ].
As a secondary-analysis method, SSA has several advantages. It efficiently uses
data collected by a Web site application. This gives the researcher access to a poten-
tially large sample of users over a significant duration, often allowing the researcher to
extend the scope of the study considerably [ 42 ]. Because the data has already been
collected, using existing transaction log data costs less than collecting primary data.
However, the use of secondary analysis is not without difficulties. Secondary data,
especially large transaction logs, is frequently not trivial to prepare, clean, and
analyze. Analysts must often make assumptions about how the data was collected, as the
logging applications were developed by third parties. Additionally, there are ethical
concerns in using transaction logs as secondary data. By definition, the analyst
is using the data for a purpose other than the one for which it was collected, which
may violate the privacy of the system users. In fact,
some point out a growing distaste for unobtrusive methods due to increased sensitiv-
ity toward the ethics involved in such analysis [ 13 ].
Sponsored-Search Analytics as an Unobtrusive Method
Sponsored-search analytics has significant advantages as a methodological approach
for the study of behaviors. These advantages include:
Scale: Transaction log applications can collect data at a scale that overcomes the
critical limiting factor in laboratory user studies, which are typically restricted
in sample size, location, scope, and duration.
Power: The sample size of transaction log data can be quite large, so inference
testing can highlight statistically significant relationships. Interestingly, the amount
of data in transaction logs from the Web is sometimes so large that nearly every
relationship tests as statistically significant because of the high statistical power.
Scope: Because transaction log data is collected in a natural context, researchers
can investigate the entire range of user-system interactions or system functionality
in a multivariable context.
Location: Transaction log data can be collected in naturalistic, distributed
environments. Therefore, the users do not have to be in an artificial laboratory setting.
Duration: Because there is no need to recruit specific participants for a
user study, transaction log data can be collected over an extended period.
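The point about power can be illustrated with a small calculation. The sketch below is not from the text: it assumes a hypothetical, practically negligible correlation of r = 0.01 and uses the standard Fisher z-transform approximation to show how the two-sided p-value for the same weak relationship shrinks as the sample size grows. The function name and the sample sizes are illustrative choices only.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: rho = 0.

    Uses the Fisher z-transform: atanh(r) * sqrt(n - 3) is approximately
    standard normal under the null hypothesis (an approximation that is
    reasonable for the large n considered here).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical effect size: a correlation far too weak to matter in practice.
r = 0.01

# Illustrative sample sizes, from a large lab-style sample to a Web-scale log.
for n in (1_000, 100_000, 10_000_000):
    p = correlation_p_value(r, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>10,}  p = {p:.3g}  ({verdict} at the 0.05 level)")
```

Under these assumptions, the same weak correlation is not significant at the smallest sample size but becomes significant at the larger ones, which is why significance alone says little about the practical importance of a relationship found in a very large log.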
All methods of data collection have both strengths not available with other methods
and inherent limitations. Sponsored-search logs have several shortcomings. First,
transaction log data is not nearly as versatile as primary data, as the data may
not have been collected to answer the same research questions. Second, sponsored-search
data is not as rich as data from some other collection methods and, therefore,
cannot support investigating the full range of concepts some researchers may want to
study. Third, the fields that the sponsored-search application records are often only
loosely linked to the concepts they are alleged to measure (e.g., a click is often used