    return logit(y);
}
where the logit method has the following implementation:
public static double logit(double y) {
    return 1.0/(1.0 + Math.exp(-y));
}
This is easy enough to implement when the values of B are known, but finding appropriate values of B given observations of the input and output is much more difficult than for the multivariate linear regression model. The most common approaches rely on so-called quasi-Newton techniques that are best left to others to implement. Although a number of high-quality numerical packages for C/C++ and FORTRAN exist for solving logistic regression, Java implementations are much rarer.
Fitting Logistic Regression with Gradient Descent
If using an interface to one of the high-quality native approaches is not an option, it is also possible to use a gradient descent approach to find the appropriate values of B. This approach iterates over the data multiple times, adjusting each value of B by an amount proportional to the sum of the errors between the predicted values and the actual y values for each observation. This amount is controlled by a parameter alpha, which is known as the "learning rate." This parameter controls how quickly the algorithm converges on a final answer and is usually set to something like 0.1 as a starting point. It must always be larger than 0 and smaller than 1.
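The update described above can be written out explicitly. Using i to index observations and j to index the elements of B, each pass over the data applies (this formula is implied by the description rather than quoted from it):

    B[j] <- B[j] + alpha * sum over i of ( y[i] - y(x[i]) ) * x[i][j]

where y(x[i]) is the model's predicted value for observation i. When alpha is too large the updates overshoot and the weights oscillate or diverge; when it is too small, convergence is reliable but slow.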
public LogisticRegression fit(double[][] x, double[] y) {
    // Initialize the weights
    B = new double[x[0].length];
    double lastError = Double.POSITIVE_INFINITY;
    for (int iter = 0; iter < MAX_ITER; iter++) {
        double err2 = 0;
        for (int i = 0; i < x.length; i++) {
            double t = y[i] - y(x[i]);
            // Note: the original text increments i here, which is a typo;
            // the inner loop must increment j.
            for (int j = 0; j < x[i].length; j++)