Thought Experiment: Meta-Definition
Every class had at least one thought experiment that the students
discussed in groups. Most of the thought experiments were very
open-ended, and the intention was to provoke discussion about a wide
variety of topics related to data science. For the first class, the initial
thought experiment was: can we use data science to define data science?
The class broke into small groups to think about and discuss this
question. Here are a few interesting things that emerged from those
conversations:
Start with a text-mining model.
We could do a Google search for “data science” and build a
text-mining model on the results. But that would depend on our being
usagists rather than prescriptionists with respect to language. A
usagist would let the masses define data science (where “the masses”
refers to whatever Google's search engine finds). Would it be better
to be a prescriptionist and refer to an authority such as the Oxford
English Dictionary? Unfortunately, the OED probably doesn't have an
entry yet, and we don't have time to wait for it. Let's agree that
there's a spectrum, that one authority doesn't feel right, and that
“the masses” doesn't either.
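As a rough illustration of the usagist approach, the simplest text-mining step is counting which terms appear most often in whatever text the search returns. The sketch below assumes the search results have already been collected as plain strings; the snippets, the stopword list, and the function names are all invented for illustration and are not real search data.

```python
import re
from collections import Counter

# Hypothetical snippets standing in for search results; not real data.
SNIPPETS = [
    "Data science combines statistics and programming to learn from data.",
    "A data scientist cleans data, builds models, and communicates results.",
    "Data science is statistics done on a laptop, some say.",
]

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "and", "the", "to", "is", "on", "from", "some", "say", "done"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(snippets, stopwords=STOPWORDS):
    """Count how often each non-stopword term appears across all snippets."""
    counts = Counter()
    for snippet in snippets:
        counts.update(t for t in tokenize(snippet) if t not in stopwords)
    return counts


counts = term_counts(SNIPPETS)
print(counts.most_common(5))
```

Counts like these are also the raw material for a word cloud. Whatever definition they suggest is only as representative as the text that was collected, which is the usagist's dilemma described above.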
So what about a clustering algorithm?
How about we look at practitioners of data science and see how
they describe what they do (maybe in a word cloud for starters)?
Then we can look at how people who claim to be other things, such as
statisticians, physicists, or economists, describe what they do.
From there, we can try to use a clustering algorithm (which we'll
use in Chapter 3) or some other model and see whether, given
“the stuff someone does” as input, it accurately predicts which
field that person is in.
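One way to make the prediction idea concrete is a nearest-centroid classifier over bag-of-words vectors. This is a supervised stand-in for the "some other model" option rather than a clustering algorithm, and it is only a minimal sketch. The labeled descriptions below are hand-written placeholders, not survey data, and the helper names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, hand-written descriptions of "the stuff someone does".
# A real study would use survey responses or profile text instead.
LABELED = [
    ("data scientist", "clean data build predictive models write code"),
    ("data scientist", "explore data write code communicate results"),
    ("statistician", "design experiments estimate parameters test hypotheses"),
    ("statistician", "fit models test hypotheses quantify uncertainty"),
    ("physicist", "run experiments measure particles derive equations"),
    ("physicist", "derive equations simulate physical systems"),
]


def vectorize(text):
    """Turn text into a bag-of-words vector (term -> count)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def fit_centroids(labeled):
    """Sum the term counts per field; cosine similarity ignores the scale."""
    sums = defaultdict(Counter)
    for field, text in labeled:
        sums[field].update(vectorize(text))
    return dict(sums)


def predict(text, centroids):
    """Return the field whose centroid is most similar to the text."""
    vec = vectorize(text)
    return max(centroids, key=lambda field: cosine(vec, centroids[field]))


centroids = fit_centroids(LABELED)
print(predict("write code and clean data", centroids))
print(predict("derive equations for particles", centroids))
```

If descriptions from different fields land near the wrong centroid as often as the right one, that would suggest the fields are not well separated by what people say they do, which is itself an answer to the thought experiment.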
Just for comparison, check out what Harlan Harris recently did related
to the field of data science: he took a survey and used clustering to
define subfields of data science, which gave rise to Figure 1-4.