More About Types - Python and HDF5


>>> with h5py.File('bool.hdf5', 'w') as f2:
...     f2.create_dataset('bool', (100,), dtype=np.bool_)

And now let's see how it looks in the file, again using h5ls:

Opened "bool.hdf5" with sec2 driver.
/                        Group
    Location:  1:96
    Links:     1
/bool                    Dataset {100/100}
    Location:  1:800
    Links:     1
    Storage:   100 logical bytes, 0 allocated bytes
    Type:      enum native signed char {
                   FALSE = 0
                   TRUE  = 1
               }
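As a rough check of the round trip, the sketch below writes a Boolean dataset and reads its dtype back. It assumes h5py and NumPy are installed; the file name `bool_demo.hdf5` is made up for illustration, and the in-memory `core` driver is used only so that nothing is written to disk.

```python
import h5py
import numpy as np

# In-memory file: the 'core' driver with backing_store=False never touches disk.
# 'bool_demo.hdf5' is a hypothetical name used only for this sketch.
with h5py.File('bool_demo.hdf5', 'w', driver='core', backing_store=False) as f:
    dset = f.create_dataset('bool', (100,), dtype=np.bool_)
    dset[0] = True              # the remaining elements keep the default fill value

    stored_dtype = dset.dtype   # h5py maps the HDF5 enum back to NumPy's bool
    values = dset[:3]           # read the first three elements into a NumPy array

print(stored_dtype)
print(values)
```

On the Python side the dataset still looks like an ordinary Boolean array; the enum representation only shows up when the file is inspected with a tool such as h5ls.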

The array Type

Not often encountered in NumPy code, the array type is a good choice when you want to store multiple values of the same type in a single element. Unlike compound types, there are no separate “fields”; rather, each element is itself a multidimensional array. There are a couple of pitfalls associated with this type and with some “helpful” behavior from NumPy, which can be confusing. Let's start with an example, in which our elements are 2×2 arrays of floats:

>>> dt = np.dtype('(2,2)f')
>>> dt
dtype(('float32',(2, 2)))
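To see what such a dtype is made of, a short sketch using only NumPy can inspect its parts. The variable names are chosen for illustration; `base`, `shape`, `subdtype`, and `itemsize` are standard attributes of `numpy.dtype`.

```python
import numpy as np

dt = np.dtype('(2,2)f')

base = dt.base          # the type of each value inside the subarray
shape = dt.shape        # the per-element array shape
sub = dt.subdtype       # (base dtype, shape) pair describing the array type
itemsize = dt.itemsize  # bytes per element: 4 float32 values of 4 bytes each

print(base, shape, itemsize)
```

Each element therefore occupies 16 bytes and has no named fields, which is the difference from a compound type.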

Now let's create an HDF5 dataset with this dtype that has 100 data points:

>>> dset = f.create_dataset('array', (100,), dtype=dt)
>>> dset.dtype
dtype(('float32',(2, 2)))
>>> dset.shape
(100,)

Retrieving a single element gives us a 2×2 NumPy array:

>>> out = dset[0]
>>> out
array([[ 0.,  0.],
       [ 0.,  0.]], dtype=float32)

You might have expected a NumPy scalar with our original dtype, but it doesn't work that way. NumPy automatically “promotes” the array-type scalar into a full-fledged array of the base type. This is convenient, but it's another case where dset[...].dtype != dset.dtype.
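The promotion can be checked directly. The following is a minimal sketch, assuming h5py and NumPy are installed; the file name `array_demo.hdf5` is hypothetical and the in-memory `core` driver is used so no file is left behind. It also shows that NumPy applies the same promotion when an array is created with an array dtype.

```python
import h5py
import numpy as np

dt = np.dtype('(2,2)f')

# 'array_demo.hdf5' is a made-up name; backing_store=False keeps it in memory.
with h5py.File('array_demo.hdf5', 'w', driver='core', backing_store=False) as f:
    dset = f.create_dataset('array', (100,), dtype=dt)

    declared_shape = dset.shape   # the dataset has 100 elements
    declared_dtype = dset.dtype   # each element is declared as a 2x2 float32 array

    one = dset[0]                 # a single element, read into NumPy
    everything = dset[...]        # the whole dataset, read into NumPy

# NumPy does the same when asked to build an array with this dtype:
# the subarray dimensions are appended to the array's shape.
arr = np.zeros(100, dtype=dt)

print(declared_shape, declared_dtype)
print(one.shape, one.dtype)
print(everything.shape, everything.dtype)
print(arr.shape, arr.dtype)
```

In both cases the extra 2×2 dimensions move out of the dtype and into the shape of the result, so code that compares dtypes or shapes between a dataset and the data read from it needs to account for this.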
