  double n = (double)y.length;
  b = (sumXY - (sumX*sumY)/n)/(sumX2 - (sumX*sumX)/n);
  a = sumY/n - (b*sumX)/n;
}
Notice that the preceding function only requires a single pass over the data.
This means that it is also amenable to a simple streaming formulation,
useful for real-time analysis:
public class StreamingSimpleLinearModel {
  // Running sums; these are all the state the fit needs.
  double sumX = 0.0, sumY = 0.0;
  double sumXY = 0.0, sumX2 = 0.0;
  double n = 0.0;
  // True when observations have arrived since a and b were last computed.
  boolean dirty = true;
  double a = 0.0, b = 0.0;

  // Recomputes a and b only if new data has been observed.
  private void update() {
    if (!dirty) return;
    b = (sumXY - (sumX*sumY)/n)/(sumX2 - (sumX*sumX)/n);
    a = sumY/n - b*sumX/n;
    dirty = false;
  }

  // Folds one (y, x) pair into the running sums.
  public void observe(double y, double x) {
    sumX += x; sumY += y;
    sumXY += x*y; sumX2 += x*x;
    n += 1.0;
    dirty = true;
  }

  public double b() { update(); return b; }
  public double a() { update(); return a; }
}
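As a quick check of the streaming formulation, the following sketch feeds four points that lie exactly on the line y = 2x + 1 through the same running sums, one observation at a time. The names StreamingDemo, Model, and fit are hypothetical and not part of the listing above. Model is only a condensed copy of StreamingSimpleLinearModel, without the dirty flag, so that the example compiles on its own.

```java
// Hypothetical demo class, not from the text. The nested Model condenses
// the StreamingSimpleLinearModel listing so this file compiles on its own.
public class StreamingDemo {
    static class Model {
        double sumX, sumY, sumXY, sumX2, n;

        // Folds one (y, x) pair into the running sums.
        void observe(double y, double x) {
            sumX += x; sumY += y;
            sumXY += x * y; sumX2 += x * x;
            n += 1.0;
        }

        // Slope, computed from the sums on every call (no caching here).
        double b() {
            return (sumXY - (sumX * sumY) / n) / (sumX2 - (sumX * sumX) / n);
        }

        // Intercept, derived from the slope.
        double a() {
            return sumY / n - b() * sumX / n;
        }
    }

    // Feeds the points one at a time and returns {a, b}.
    static double[] fit(double[] x, double[] y) {
        Model m = new Model();
        for (int i = 0; i < x.length; i++) {
            m.observe(y[i], x[i]);
        }
        return new double[] { m.a(), m.b() };
    }

    public static void main(String[] args) {
        double[] x = { 0, 1, 2, 3 };
        double[] y = { 1, 3, 5, 7 }; // points on y = 2x + 1
        double[] ab = fit(x, y);
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```

For this data the intercept a comes out as 1 and the slope b as 2. Note that the estimates are not meaningful until at least two distinct x values have been observed: before that, the denominator sumX2 - (sumX*sumX)/n is zero and the division yields NaN.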
Multivariate Linear Regression
Computing the best estimates for B when there is more than one x variable,
excluding the intercept term, is a bit more complicated, but it's still
straightforward. With a technique called ordinary least squares, which is
sometimes used as a synonym for multivariate linear regression, the values
of B have a closed form so long as certain requirements are met.
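To make the idea of a closed form concrete, the standard ordinary least squares solution is B = (X^T X)^-1 X^T y, which exists when the columns of X are linearly independent, so that X^T X can be inverted. The sketch below is one way this might be computed. It solves the equivalent system (X^T X) B = X^T y by Gaussian elimination instead of forming the inverse. The class name NormalEquations, the method fit, the singularity threshold, and the choice to represent the intercept as a column of ones are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not from the text: solves (X^T X) B = X^T y.
public class NormalEquations {
    // x has one row per observation and p columns; returns the p coefficients.
    static double[] fit(double[][] x, double[] y) {
        int n = x.length, p = x[0].length;
        // Build the augmented system [X^T X | X^T y] in a single pass.
        double[][] m = new double[p][p + 1];
        for (int r = 0; r < n; r++) {
            for (int i = 0; i < p; i++) {
                for (int j = 0; j < p; j++) m[i][j] += x[r][i] * x[r][j];
                m[i][p] += x[r][i] * y[r];
            }
        }
        // Forward elimination with partial pivoting.
        for (int c = 0; c < p; c++) {
            int best = c;
            for (int r = c + 1; r < p; r++) {
                if (Math.abs(m[r][c]) > Math.abs(m[best][c])) best = r;
            }
            double[] tmp = m[c]; m[c] = m[best]; m[best] = tmp;
            // The 1e-12 threshold is an arbitrary choice for this sketch.
            if (Math.abs(m[c][c]) < 1e-12) {
                throw new ArithmeticException("X^T X is singular");
            }
            for (int r = c + 1; r < p; r++) {
                double f = m[r][c] / m[c][c];
                for (int k = c; k <= p; k++) m[r][k] -= f * m[c][k];
            }
        }
        // Back substitution.
        double[] b = new double[p];
        for (int i = p - 1; i >= 0; i--) {
            double s = m[i][p];
            for (int j = i + 1; j < p; j++) s -= m[i][j] * b[j];
            b[i] = s / m[i][i];
        }
        return b;
    }

    public static void main(String[] args) {
        // Columns: constant 1 (intercept), x1, x2. Data follow y = 1 + 2*x1 + 3*x2.
        double[][] x = { {1, 0, 0}, {1, 1, 0}, {1, 0, 1}, {1, 1, 1}, {1, 2, 1} };
        double[] y = { 1, 3, 4, 6, 8 };
        System.out.println(Arrays.toString(fit(x, y)));
    }
}
```

Because the sample data lie exactly on a plane, the recovered coefficients should be 1, 2, and 3 up to floating-point rounding. As in the single-variable case, X^T X and X^T y are plain sums over the observations, so they too could be accumulated in one pass.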