Exercise: RealDirect Data Strategy
You have been hired as chief data scientist at realdirect.com, and report
directly to the CEO. The company (hypothetically) does not yet have
its data plan in place. It's looking to you to come up with a data strategy.
Here are a couple of ways you could begin to approach this problem:
1. Explore its existing website, thinking about how buyers and sellers
would navigate through it, and how the website is structured/
organized. Try to understand the existing business model, and
think about how analysis of RealDirect user-behavior data could
be used to inform decision-making and product development.
Come up with a list of research questions you think could be
answered by data:
• What data would you advise the engineers log and what would
your ideal datasets look like?
• How would data be used for reporting and monitoring product
usage?
• How would data be built back into the product/website?
2. Because there is no data yet for you to analyze (typical in a
start-up when it's still building its product), you should get some
auxiliary data to help gain intuition about this market. For example,
go to https://github.com/oreillymedia/doing_data_science. Click
on Rolling Sales Update (after the fifth paragraph).
You can use any or all of the datasets here; start with Manhattan,
August 2012 to August 2013.
• First challenge: load in and clean up the data. Next, conduct
exploratory data analysis in order to find out where there are
outliers or missing values, decide how you will treat them, make
sure the dates are formatted correctly, make sure values you
think are numerical are being treated as such, etc.
• Once the data is in good shape, conduct exploratory data analysis
to visualize and make comparisons (i) across neighborhoods,
and (ii) across time. If you have time, start looking for
meaningful patterns in this dataset.
3. Summarize your findings in a brief report aimed at the CEO.
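As a starting point for the cleaning and comparison steps in item 2, here is a minimal sketch using only the Python standard library. The column names (NEIGHBORHOOD, SALE PRICE, GROSS SQUARE FEET, SALE DATE), the date format, and the four sample rows are assumptions made up for illustration; check them against the file you actually download. Treating a zero price or zero square footage as missing is one possible choice, not a rule from the exercise.

```python
import csv
import io
import statistics
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows. Column names and formats are assumptions;
# replace this string with the real file, e.g. open("your_file.csv", newline="").
SAMPLE_CSV = """NEIGHBORHOOD,SALE PRICE,GROSS SQUARE FEET,SALE DATE
CHELSEA,"$1,250,000","1,100",2012-09-14
CHELSEA,$0,0,2012-10-02
HARLEM-CENTRAL,"$480,000",950,2013-01-20
HARLEM-CENTRAL,"$520,000",,2013-01-28
"""


def to_number(text):
    """Strip '$' and ',' and return a float, or None if blank or unparseable."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    if not cleaned:
        return None
    try:
        return float(cleaned)
    except ValueError:
        return None


def load_sales(handle):
    """Read rows, coerce numeric fields to numbers, and parse dates."""
    rows = []
    for raw in csv.DictReader(handle):
        price = to_number(raw["SALE PRICE"])
        sqft = to_number(raw["GROSS SQUARE FEET"])
        rows.append({
            "neighborhood": raw["NEIGHBORHOOD"].strip().title(),
            # Assumption: a value of 0 is treated as missing, not as a real sale.
            "price": price if price else None,
            "sqft": sqft if sqft else None,
            "date": datetime.strptime(raw["SALE DATE"].strip(), "%Y-%m-%d").date(),
        })
    return rows


def median_price_by(rows, key):
    """Median sale price per group, skipping rows with a missing price."""
    groups = defaultdict(list)
    for row in rows:
        if row["price"] is not None:
            groups[key(row)].append(row["price"])
    return {k: statistics.median(v) for k, v in sorted(groups.items())}


sales = load_sales(io.StringIO(SAMPLE_CSV))

# (i) comparison across neighborhoods, (ii) comparison across time (by month)
by_neighborhood = median_price_by(sales, lambda r: r["neighborhood"])
by_month = median_price_by(sales, lambda r: r["date"].strftime("%Y-%m"))

# Count how many rows were flagged as having no usable price.
missing_price = sum(1 for r in sales if r["price"] is None)

print(by_neighborhood)
print(by_month)
print("rows with missing price:", missing_price)
```

The median is used here instead of the mean because a handful of very large sales can dominate an average; whether that is the right summary for the CEO report is a judgment call to revisit once you have seen the real distribution.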