The err array already stores the product of the unit error and its derivative, so it
only needs to be multiplied by the previous layer's activation values, which are
passed into the following implementation as the array o :
public double[] update(double[] o, double r) {
    for (int i = 0; i < v.length; i++) {
        // Adjust the bias weight, if this layer has one.
        if (bW != null)
            bW[i] += r * err[i];
        double[] W = w[i];
        // Gradient-descent step: scale the stored error by the learning
        // rate and by the previous layer's activation.
        for (int j = 0; j < W.length; j++)
            W[j] += r * err[i] * o[j];
    }
    return v; // this layer's output values
}
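Concretely, the inner loop applies the gradient-descent rule w[i][j] += r * err[i] * o[j] to every weight. The following standalone snippet works through one such step with made-up numbers; the values are purely illustrative and are not taken from the example network:

public class UpdateStep {
    public static void main(String[] args) {
        double r = 0.5;   // learning rate
        double err = 0.1; // err[i]: unit error times activation derivative
        double o = 2.0;   // o[j]: previous layer's activation
        double w = 0.3;   // weight w[i][j] before the update

        w += r * err * o; // 0.3 + 0.5 * 0.1 * 2.0 = 0.4
        System.out.println("updated weight: " + w);
    }
}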
The other value passed to the update method, r , is a “learning rate” similar
to the one used in the LogisticRegression example. It keeps each weight
adjustment from moving too far in a single step, which helps the stability of
the gradient descent method. Typical values for most networks are between 0.2
and 0.8, and finding the best rate usually takes some trial and error.
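To see why a rate that is too large hurts stability, consider gradient descent on the one-dimensional error surface E(w) = w^2, whose gradient is 2w. The toy program below is not part of the book's network; it only illustrates how small rates shrink the error while a rate above 1.0 makes every step overshoot further than the last:

public class RateDemo {
    // Run 20 gradient-descent steps on E(w) = w^2, starting from w = 1.
    static double descend(double r) {
        double w = 1.0;
        for (int step = 0; step < 20; step++)
            w -= r * 2 * w; // step against the gradient, scaled by r
        return w;           // distance remaining from the minimum at 0
    }

    public static void main(String[] args) {
        for (double r : new double[] {0.1, 0.5, 0.9, 1.1})
            System.out.printf("r = %.1f -> w after 20 steps = %g%n", r, descend(r));
    }
}

On this surface each step multiplies w by (1 - 2r), so any rate above 1.0 grows the weight without bound; that runaway behavior is exactly what a modest learning rate guards against.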
Finally, the weights need to be initialized before learning can begin. By
default, all of the weights in a network start with a value of zero, but this
can leave the network trapped in a local minimum, unable to reach the “best”
network that fits the data. Usually, it is best to randomize the weights
before training, as shown in the following code, which is added to the
Layer implementation:
public void initialize(Random rng) {
    for (int i = 0; i < v.length; i++) {
        // Draw each weight uniformly from [-1, 1).
        for (int j = 0; j < w[i].length; j++)
            w[i][j] = 2 * rng.nextDouble() - 1;
        // Skip the bias weights if this layer has none, as update() does.
        if (bW != null)
            bW[i] = 2 * rng.nextDouble() - 1;
    }
}

public void initialize() {
    initialize(new Random());
}
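Passing the Random into initialize also makes experiments repeatable: a fixed seed such as new Random(42) produces the same starting weights on every run, which is useful when comparing learning rates, while the no-argument overload simply delegates to a freshly seeded generator.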