Database Reference
In-Depth Information
Linear models
The core idea of linear models (or generalized linear models) is that we model the pre-
dicted outcome of interest (often called the target or dependent variable) as a function of a
simple linear predictor applied to the input variables (also referred to as features or inde-
pendent variables).
y = f(w T x)
Here, y is the target variable, w is the vector of parameters (known as the weight vector),
and x is the vector of input features.
w T x is the linear predictor (or vector dot product) of the weight vector w and feature vector
x . To this linear predictor, we applied a function f (called the link function).
Linear models can, in fact, be used for both classification and regression, simply by chan-
ging the link function. Standard linear regression (covered in the next chapter) uses an
identity link (that is, y = w T x directly), while binary classification uses alternative link
functions as discussed here.
Let's take a look at the example of online advertising. In this case, the target variable would
be 0 (often assigned the class label of -1 in mathematical treatments) if no click was ob-
served for a given advert displayed on a web page (called an impression). The target vari-
able would be 1 if a click occurred. The feature vector for each impression would consist of
variables related to the impression event (such as features relating to the user, web page,
advert and advertiser, and various other factors relating to the context of the event, such as
the type of device used, time of the day, and geolocation).
Thus, we would like to find a model that maps a given input feature vector (advert impres-
sion) to a predicted outcome (click or not). To make a prediction for a new data point, we
will take the new feature vector (which is unseen, and hence, we do not know what the tar-
get variable is) and compute the dot product with our weight vector. We will then apply the
relevant link function, and the result is our predicted outcome (after applying a threshold to
the prediction, in the case of some models).
Given a set of input data in the form of feature vectors and target variables, we would like
to find the weight vector that is the best fit for the data, in the sense that we minimize some
Search WWH ::




Custom Search