another way, can the research results be applied to a wider group than just those from
whom the data is collected? Reliability is chiefly concerned with making sure the
method of data gathering leads to consistent results.
Potpourri: The Texas sharpshooter fallacy is a logical fallacy in which information
that has no relationship is interpreted or manipulated until it appears to have
meaning.
The name comes from a joke about a Texan who fires some shots at the side
of a barn, then paints a target centered on the biggest cluster of hits and claims to
be a sharpshooter.
The fallacy applies to those situations where one does not have an ex ante or
prior expectation of the particular data relationship in question.
The fallacy comes from the tendency of people to see patterns where no real
pattern exists or where there is no basis to believe a pattern exists. It is related to
the clustering illusion, the tendency in human cognition to perceive patterns in
random data where none actually exist.
Therefore, in building theory from empirical data (known as grounded theory), we
can examine data and possibly determine that some pattern or relationship exists.
To avoid the Texas sharpshooter fallacy, we would then explore other datasets
to see if the same pattern exists or test the correlation in a controlled experiment.
Note that we would need to use NEW data gathered under independent conditions.
If we used the same data in which we originally detected the pattern for
hypothesis testing, we would be committing the Texas sharpshooter fallacy.
(Note: Nothing against Texas. Lived there for two years and loved it!)
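The difference between finding a pattern and testing it can be illustrated with a small simulation. The following is a minimal sketch, not a method taken from the text: it assumes purely random data, and the names (`pearson`, `strongest_feature`, `random_dataset`) and the sizes (200 candidate variables, 30 observations) are hypothetical choices made for illustration. It picks the variable that correlates most strongly with an outcome in one dataset (painting the target around the cluster), then measures that same variable against NEW, independently generated data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_feature(features, outcome):
    """Return (index, r) for the feature with the largest |r| against outcome."""
    scored = [(i, pearson(f, outcome)) for i, f in enumerate(features)]
    return max(scored, key=lambda pair: abs(pair[1]))


def random_dataset(rng, n_features, n_rows):
    """Generate unrelated Gaussian noise: no real relationship exists."""
    features = [[rng.gauss(0, 1) for _ in range(n_rows)]
                for _ in range(n_features)]
    outcome = [rng.gauss(0, 1) for _ in range(n_rows)]
    return features, outcome


rng = random.Random(0)        # fixed seed so the run is repeatable
N_FEATURES, N_ROWS = 200, 30  # hypothetical sizes, chosen for illustration

# Step 1: explore one dataset and pick the most striking "pattern".
explore_features, explore_outcome = random_dataset(rng, N_FEATURES, N_ROWS)
best_index, explore_r = strongest_feature(explore_features, explore_outcome)

# Step 2: test the SAME feature on NEW, independently generated data.
fresh_features, fresh_outcome = random_dataset(rng, N_FEATURES, N_ROWS)
fresh_r = pearson(fresh_features[best_index], fresh_outcome)

print(f"feature {best_index}: r on exploration data = {explore_r:+.2f}")
print(f"feature {best_index}: r on new data         = {fresh_r:+.2f}")
```

Because the exploration step selects the largest of 200 chance correlations, the first figure tends to look impressive even though every variable is noise. The second figure is the honest test, since the hypothesis was fixed before the new data was examined.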
How do we address the issues of credibility, validity, and reliability? Based on
previous research [53], we know there are six questions that we must address in every
analysis project using trace data from sponsored-search logs.
Which data is analyzed? The analyst must clearly articulate, in a precise manner
and format, what trace data was recorded. With transaction log software, this
is much easier than in other forms of trace data, as logging applications can be
reverse-engineered to clearly articulate exactly what behavioral data is recorded.
How is this data defined? The analyst must clearly define each trace measure in
a manner that permits replication of the research on other systems and with other
users. As transaction log analysis has proliferated in a variety of venues, more
precise definitions of measures are developing [54, 32, 33].
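One way to make a measure replicable is to state its definition as executable code. The sketch below is only an illustration under stated assumptions: the record fields (`user_id`, `timestamp`, `query`), the two measures, and the 30-minute session cutoff are hypothetical and are not definitions taken from the cited work. The sample records are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One line of a transaction log (hypothetical fields)."""
    user_id: str      # anonymous identifier, e.g. a cookie or IP address
    timestamp: float  # seconds since the start of logging
    query: str        # query text as submitted by the searcher


# Assumed cutoff: a gap longer than this starts a new session.
SESSION_GAP_SECONDS = 30 * 60


def query_length(record):
    """Query length: the number of whitespace-delimited terms in the query."""
    return len(record.query.split())


def count_sessions(records):
    """Sessions: for each user, a new session begins with the user's first
    record or after a gap of more than SESSION_GAP_SECONDS."""
    last_seen = {}
    sessions = 0
    for record in sorted(records, key=lambda r: r.timestamp):
        previous = last_seen.get(record.user_id)
        if previous is None or record.timestamp - previous > SESSION_GAP_SECONDS:
            sessions += 1
        last_seen[record.user_id] = record.timestamp
    return sessions


# Made-up records, for illustration only.
records = [
    LogRecord("u1", 0.0, "cheap flights austin"),
    LogRecord("u1", 600.0, "cheap flights austin texas"),
    LogRecord("u1", 4000.0, "hotels"),
    LogRecord("u2", 100.0, "running shoes"),
]

print([query_length(r) for r in records])  # [3, 4, 1, 2]
print(count_sessions(records))             # 3
```

Writing the definition this way forces each choice into the open (what counts as a term, how long a gap ends a session), so another analyst can apply the same measure to a different system or change one assumption and report the effect.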
What is the population from which the analyst has drawn the data? The analyst
must be cognizant of the actors, both people and systems, who created the trace
data. With transaction logs on the Web, this is sometimes a difficult issue to address
directly, unless the system requires some type of log-on from which profiles are then
available. In the absence of these profiles, the analyst must rely on demographic
surveys, studies of the system's user population, or general Web demographics.
What is the context in which the analyst analyzed the data? It is important for
the researcher to clearly articulate the environmental, situational, and contextual
 