public class LinearModel {
    private double[] B; // coefficients; how B is set is shown elsewhere
    public double y(double[] x) {
        double y = 0;
        for (int i = 0; i < B.length; i++) y += B[i] * x[i];
        return y;
    }
}
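As a quick illustration of the prediction step, the class can be exercised as follows. This is a sketch, not code from the text: the constructor taking the coefficient array B is an assumption, since the listing above does not show how B is initialized.

```java
// Minimal sketch of using a LinearModel-style predictor.
// The constructor taking B is assumed; it is not shown in the text.
public class LinearModelDemo {
    static class LinearModel {
        private final double[] B;
        LinearModel(double[] B) { this.B = B; }
        double y(double[] x) {
            double y = 0;
            for (int i = 0; i < B.length; i++) y += B[i] * x[i];
            return y;
        }
    }
    public static void main(String[] args) {
        // Model y = 1*x0 + 2*x1; by convention x0 = 1 is the intercept term.
        LinearModel m = new LinearModel(new double[] {1.0, 2.0});
        System.out.println(m.y(new double[] {1.0, 3.0})); // prints 7.0
    }
}
```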
“Fitting” these models means finding appropriate values of B, given that the
values of x are observed along with y , which can be interpreted as a noisy
version of the model output y(x) . In the original formulation this noise is
normally distributed (normal distributions are discussed in Chapter 9) with
a mean of zero and a variance of s . Fitting is done by minimizing the square
of the difference between the observed values of y and the values returned
by the y method of the LinearModel class for the corresponding x .
In other words, the goal is to find an array B that yields the least sum of
squares, or “least squares error,” computed by the following function:
public double error(double[] y, double[][] x) {
    double error = 0.0;
    for (int i = 0; i < y.length; i++) {
        double diff = y[i] - y(x[i]);
        error += diff * diff;
    }
    return error;
}
This error is also known as the residual sum of squares (RSS) and is often
used to determine how well a model fits the data. Inspecting the individual
residuals (usually without squaring them) helps determine whether or not
the model is well specified: a trend or a periodic signal in the residuals is
often a sign that the model is missing a term.
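The residual check described above can be sketched as follows. The coefficients and data here are hypothetical, chosen only to show the computation; in practice y and x would be the observed data and B the fitted coefficients.

```java
// Sketch of computing residuals for inspection (hypothetical data).
public class ResidualCheck {
    // Prediction with a fixed coefficient array B.
    static double predict(double[] B, double[] x) {
        double y = 0;
        for (int i = 0; i < B.length; i++) y += B[i] * x[i];
        return y;
    }
    // Residuals y[i] - prediction; a trend or periodic pattern in these
    // values suggests the model is missing a term.
    static double[] residuals(double[] B, double[] y, double[][] x) {
        double[] r = new double[y.length];
        for (int i = 0; i < y.length; i++) r[i] = y[i] - predict(B, x[i]);
        return r;
    }
    public static void main(String[] args) {
        double[] B = {1.0, 2.0};                       // hypothetical fit
        double[][] x = {{1, 0}, {1, 1}, {1, 2}};       // first column = intercept
        double[] y = {1.1, 2.9, 5.2};                  // noisy observations
        for (double r : residuals(B, y, x)) System.out.println(r);
    }
}
```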
Simple Linear Regression
In the case where the x array only ever holds two values, with the first value
being the constant 1 (known as the intercept term), there is a simple closed-form
solution for the two values of the B array. When this happens, the first value
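The closed-form solution for this two-coefficient case is the standard simple-regression formula, not code from the text: the slope is B[1] = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)², and the intercept is B[0] = ȳ − B[1]·x̄. A sketch in the style of the listings above:

```java
// Sketch of the standard closed-form least-squares fit for
// y ≈ B[0] + B[1]*x (simple linear regression).
public class SimpleRegression {
    static double[] fit(double[] x, double[] y) {
        double xbar = 0, ybar = 0;
        for (int i = 0; i < x.length; i++) { xbar += x[i]; ybar += y[i]; }
        xbar /= x.length;
        ybar /= y.length;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - xbar) * (y[i] - ybar); // covariance numerator
            sxx += (x[i] - xbar) * (x[i] - xbar); // variance numerator
        }
        double slope = sxy / sxx;
        return new double[] { ybar - slope * xbar, slope }; // {B[0], B[1]}
    }
    public static void main(String[] args) {
        // Noise-free data on the line y = 2 + 3x is recovered exactly.
        double[] x = {0, 1, 2, 3};
        double[] y = {2, 5, 8, 11};
        double[] B = fit(x, y);
        System.out.println(B[0] + " " + B[1]); // prints 2.0 3.0
    }
}
```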