    return logit(y);
}
where the logit method has the following implementation:
public static double logit(double y) {
    return 1.0/(1.0 + Math.exp(-y));
}
This is easy enough to implement when the values of B are known, but finding appropriate values of B given observations of the input and output is much more difficult than for the multivariate linear regression model. The most common approaches rely on so-called quasi-Newton techniques that are best left to others to implement. Although a number of high-quality numerical packages for C/C++ and FORTRAN exist for solving logistic regression, Java implementations are much rarer.
Fitting Logistic Regression with Gradient Descent
If using an interface to one of the high-quality native approaches is not an option, it is also possible to use a gradient descent approach to find the appropriate values of B. This approach iterates over the data multiple times, adjusting each value of B by an amount proportional to the sum of the errors between the predicted values and the actual y values for each observation. This amount is controlled by a parameter alpha, which is known as the "learning rate." This parameter controls how quickly the algorithm converges on a final answer and is usually set to something like 0.1 as a starting point. It must always be larger than 0 and smaller than 1.
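The update described above can be written out explicitly. Using i to index observations and j to index the elements of B, each pass over the data applies (this formula is implied by the description rather than quoted from it):

    B[j] <- B[j] + alpha * sum over i of ( y[i] - y(x[i]) ) * x[i][j]

where y(x[i]) is the model's predicted value for observation i. When alpha is too large the updates overshoot and the weights oscillate or diverge; when it is too small, convergence is reliable but slow.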
public LogisticRegression fit(double[][] x, double[] y) {
    // Initialize the weights
    B = new double[x[0].length];
    double lastError = Double.POSITIVE_INFINITY;
    for (int iter = 0; iter < MAX_ITER; iter++) {
        double err2 = 0;
        for (int i = 0; i < x.length; i++) {
            double t = y[i] - y(x[i]);
            // Note: the original text increments i here, which is a typo;
            // the inner loop must increment j.
            for (int j = 0; j < x[i].length; j++)