Database Reference
In-Depth Information
The Reservoir Algorithm
Reservoir algorithms are a class of algorithms developed to sample from
streaming populations. They are closely related in concept to the
Fisher-Yatesshufflefromthelastsection.Thisismosteasilyseeninanearly
version of the Fisher-Yates shuffle:
public static <E> void durstenfeldPermute(E[] array) {
for ( int i=array.length-1;i>0;i--) {
int j = ( int ) Math . floor (i*rng.nextDouble());
E temp = array[i];array[i] = array[j];array[j] =
temp;
}
}
Inthisversionofthealgorithm,theindexisshuffledfrombacktofront.This
means that elements from farther out in the array are swapped into earlier
sections. It is this notion that leads to the basic reservoir algorithm.
In the reservoir algorithm, an array of elements that form the reservoir is
first allocated. This takes the role of the first n elements of the array when
sampling without replacement:
public class Reservoir<E> implements List<E> {
MersenneTwister rng = new MersenneTwister();
Object[] reservoir;
int N;
public Reservoir( int n) {
super ();
reservoir = new Object[n];
N = 0;
}
The first reservoir.length elements can simply be added to the array.
After that, a random position is drawn between N and 0. If that position
falls into the array, it replaces the element currently there; otherwise, it is
ignored:
public boolean add(E arg0) {
if (N < reservoir.length) {
Search WWH ::




Custom Search