attribute), it's created with a fixed data type. Suppose you have multiple data products
in a file (for example, many datasets containing image data), and you want to be sure
each has exactly the same type.
HDF5 provides a native way to ensure this, by allowing you to save a data type to the
file independently of any particular dataset or attribute. When you call create_dataset,
you supply the stored type and HDF5 will “link” the type to the brand new dataset.
The Datatype Object
You can create such an independent, or “named” type, by simply assigning a NumPy
dtype to a name in the file:
>>> f['mytype'] = np.dtype('float32')
When we open the named type, we don't get a dtype back, but something else:
>>> out = f['mytype']
>>> out
<HDF5 named type "mytype" (dtype <f4)>
Like the Dataset object, this h5py.Datatype object is a thin proxy that allows access to
the underlying HDF5 datatype. The most immediately obvious property is
Datatype.dtype, which returns the equivalent NumPy dtype object:
>>> out.dtype
dtype('float32')
Since they're full-fledged objects in the file, you have a lot of other properties as well:
>>> out.name
u'/mytype'
>>> out.parent
<HDF5 group "/" (6 members)>
Also available are .file (the h5py.File instance containing the type), .ref (an object
reference to the type), and attributes, just like Dataset and Group objects:
>>> out.attrs['info'] = "This is an attribute on a named type object"
In the HDF5 world, for technical reasons, named types are now called
committed types. You may hear both terms; for our purposes, they
mean the same thing.
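The steps in this section can be collected into one short script. This is a minimal sketch, not a canonical recipe: the file name is arbitrary, and the in-memory "core" driver is an assumption made only so that nothing is written to disk.

```python
import h5py
import numpy as np

# Assumption for illustration: an in-memory file (core driver, no backing
# store), so the example leaves nothing on disk. The file name is arbitrary.
f = h5py.File("named_type_demo.h5", "w", driver="core", backing_store=False)

# Commit a NumPy dtype to the file under the name "mytype".
f["mytype"] = np.dtype("float32")

# Opening the name returns an h5py.Datatype proxy, not a NumPy dtype.
named = f["mytype"]
named.attrs["info"] = "This is an attribute on a named type object"

# Copy out plain Python values before the file is closed.
is_datatype = isinstance(named, h5py.Datatype)
numpy_dtype = named.dtype          # equivalent NumPy dtype
path = named.name                  # absolute path of the type in the file
parent_path = named.parent.name    # path of the containing group
info = named.attrs["info"]         # attribute stored on the type itself

f.close()
```

Values such as the NumPy dtype and the path strings are ordinary Python objects, so they remain usable after the file is closed; the Datatype proxy itself does not.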
Linking to Named Types
It's simple to create a dataset or attribute that refers to a named type object; just supply
the Datatype instance as the dtype:
>>> dset = f.create_dataset("typedemo", (100,), dtype=f['mytype'])
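To connect this back to the original motivation (several datasets that must share exactly one type), here is a small sketch under stated assumptions. The dataset names image_a and image_b and their shapes are hypothetical, and the in-memory file is used only to keep the example self-contained.

```python
import h5py
import numpy as np

# Assumption for illustration: in-memory file, arbitrary file name.
with h5py.File("link_demo.h5", "w", driver="core", backing_store=False) as f:
    # Store the type once, independently of any dataset.
    f["mytype"] = np.dtype("float32")

    # Hypothetical datasets; each one is created from the stored type.
    f.create_dataset("image_a", (100,), dtype=f["mytype"])
    f.create_dataset("image_b", (64, 64), dtype=f["mytype"])

    # Collect the NumPy dtype that each dataset reports.
    dtypes = {name: f[name].dtype for name in ("image_a", "image_b")}

    # Every dataset should report the same dtype as the named type.
    all_match = all(dt == f["mytype"].dtype for dt in dtypes.values())
```

Because both datasets are created from the same stored type rather than from separately written dtype strings, a mismatch between them cannot creep in through a typo in one call.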