Let's back up. Why would you even want to build a linear model in the
first place? You might want to use this relationship to predict future
outcomes, or you might want to understand or describe the relationship
to get a grasp on the situation. Let's say you're studying the relationship
between a company's sales and how much that company
spends on advertising, or the number of friends someone has on a
social networking site and the time that person spends on that site
daily. These are all numerical outcomes, which means linear regression
would be a wise choice, at least for a first pass at your problem.
One entry point for thinking about linear regression is to think about
deterministic lines first. We learned back in grade school that we could
describe a line with a slope and an intercept, $y = f(x) = \beta_0 + \beta_1 x$. But
the setting there was always deterministic.
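To make the deterministic setting concrete, here is a minimal sketch of a line as a plain function. The name `line` and the values chosen for the intercept and slope are illustrative assumptions, not taken from the text.

```python
def line(x, beta0, beta1):
    """Deterministic line: the same x always produces the same y."""
    return beta0 + beta1 * x

# Hypothetical intercept and slope, chosen only for illustration.
ys = [line(x, beta0=2.0, beta1=0.5) for x in (0, 1, 2)]
print(ys)  # [2.0, 2.5, 3.0]
```

Calling `line` twice with the same arguments returns the same value, which is what "deterministic" means here.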
Even for the most mathematically sophisticated among us, if you haven't
done it before, it's a new mindset to start thinking about stochastic
functions. We still have the same components: points listed out explicitly
in a table (or as tuples), and functions represented in equation
form or plotted on a graph. So let's build up to linear regression starting
from a deterministic function.
Example 1. Overly simplistic example to start. Suppose you run a
social networking site that charges a monthly subscription fee of $25,
and that this is your only source of revenue. Each month you collect
data and count your number of users and total revenue. You've done
this every month over the course of two years, recording it all in a spreadsheet.
You could express this data as a series of points. Here are the first four:
$S = \{(x, y)\} = \{(1, 25), (10, 250), (100, 2500), (200, 5000)\}$
If you showed this to someone else who didn't even know how much
you charged or anything about your business model (what kind of
friend wasn't paying attention to your business model?!), they might
notice that there's a clear relationship enjoyed by all of these points,
namely $y = 25x$. They likely could do this in their head, in which case
they figured out that:
• There's a linear pattern.
• The coefficient relating x and y is 25.
• It seems deterministic.
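The three observations above can also be checked mechanically. The following is a minimal sketch using the four points from the example; the variable names are assumptions made for illustration, and the check only covers a line through the origin, which is all this example needs.

```python
# The first four (users, revenue) points from the example.
points = [(1, 25), (10, 250), (100, 2500), (200, 5000)]

# If y / x is the same for every point, the points lie on a line
# through the origin, and that shared ratio is the coefficient.
ratios = [y / x for x, y in points]
coefficient = ratios[0]
is_linear = all(r == coefficient for r in ratios)

# Deterministic here means no point deviates from y = coefficient * x.
residuals = [y - coefficient * x for x, y in points]
is_deterministic = all(r == 0 for r in residuals)

print(coefficient, is_linear, is_deterministic)  # 25.0 True True
```

With real, noisy data the ratios would not agree exactly and the residuals would not all be zero, which is where a stochastic view of the relationship becomes necessary.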