How can we get the value itself, without it being wrapped in a NumPy array? It turns
out there's another way to slice into NumPy arrays (and Dataset objects). You can index
with a somewhat bizarre-looking empty tuple:
>>> dset[()]
42
So keep these in your toolkit:
1. Using Ellipsis gives you all the elements in the dataset (always as an array object).
2. Using an empty tuple "()" gives you all the elements in the dataset, as an array object
for 1D and higher datasets, and as a scalar element for 0D datasets.
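The same two rules can be checked on plain NumPy arrays, which follow the same indexing convention described above. The following is a minimal sketch rather than an h5py session; the names scalar, vector, as_array, as_scalar, and vec_all are made up for illustration.

```python
import numpy as np

scalar = np.array(42)   # a 0-D array holding a single value
vector = np.arange(3)   # a 1-D array: 0, 1, 2

# Ellipsis always hands back an array object, even for 0-D data.
as_array = scalar[...]

# An empty tuple unwraps 0-D data into a scalar element...
as_scalar = scalar[()]

# ...but still returns an array for 1-D (and higher) data.
vec_all = vector[()]

print(type(as_array).__name__, as_array.shape)
print(type(as_scalar).__name__, as_scalar)
print(type(vec_all).__name__, vec_all)
```

The printed scalar type name depends on the platform's default integer size, so it is not shown here.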
To make things even more confusing, you may see code in the wild
that uses the .value attribute of a dataset. This is a historical wart that
is exactly equivalent to doing dataset[()]. It's long deprecated and
not available in modern versions of h5py.
Boolean Indexing
In an earlier example, we used an interesting expression to set negative entries in a
NumPy array val to zero:
>>> val[val < 0] = 0
This is an idiom in NumPy made possible by Boolean-array indexing. If val is a NumPy
array of integers, then the result of the expression val < 0 is an array of Booleans. Its
entries are True where the corresponding elements of val are negative, and False
elsewhere. In the NumPy world, this is also known as a mask.
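To make the mask concrete, here is a small NumPy-only sketch. The sample values and the names mask and negatives are illustrative and not taken from the original example.

```python
import numpy as np

val = np.array([3, -1, 4, -1, 5, -9])

# The comparison yields a Boolean array of the same shape as val.
mask = val < 0

# Indexing with the mask picks out only the entries where it is True.
negatives = val[mask]

# Assigning through the mask overwrites only those entries.
val[mask] = 0

print(mask)
print(negatives)
print(val)
```

Storing the mask in a variable, instead of writing val[val < 0] inline, makes it easy to reuse the same selection for both reading and writing.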
Crucially, in both the NumPy and HDF5 worlds, you can use a Boolean array as an
indexing expression. This does just what you'd expect; it selects the dataset elements
where the corresponding index entries are True, and de-selects the rest.
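A minimal sketch of mask-based selection on an HDF5 dataset follows, assuming h5py is installed. The file name mask_demo.hdf5, the dataset name example, and the sample values are made up for illustration; the in-memory "core" driver is used here only so that nothing is written to disk.

```python
import numpy as np
import h5py

# driver="core" with backing_store=False keeps the file purely in memory.
with h5py.File("mask_demo.hdf5", "w", driver="core", backing_store=False) as f:
    data = np.array([0.5, -0.25, 0.75, -1.0])
    dset = f.create_dataset("example", data=data)

    # The mask is an ordinary NumPy Boolean array with the dataset's shape.
    mask = data < 0

    # Reading through the mask returns only the selected elements.
    selected = dset[mask]

    # Writing through the mask touches only the selected elements.
    dset[mask] = 0
    result = dset[...]

print(selected)
print(result)
```

Note that the mask is computed from the in-memory array data, not from the dataset itself, so no extra read from the file is needed to build it.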
In the spirit of the previous example, let's suppose we have a dataset initialized to a set
of random numbers distributed between -1 and 1:
>>> data = np.random.random(10) * 2 - 1
>>> data
array([ 0.98885498, -0.28554781, -0.17157685, -0.05227003, 0.66211931,
0.45692186, 0.07123649, -0.40374417, 0.22059144, -0.82367672])
>>> dset = f.create_dataset('random', data=data)
Let's clip the negative values to 0, by using a Boolean array:
>>> dset[data < 0] = 0
>>> dset[...]