string as an attribute. That created a variable-length ASCII string (see “Variable-Length
Strings” on page 89).
In contrast, an instance of np.string_ would get stored as a fixed-length string in the
file:
>>> dset.attrs['title_fixed'] = np.string_("Another title")
This generally isn't an issue, but some older FORTRAN-based programs can't deal with
variable-length strings. If this is a problem for your application, use np.string_, or
equivalently, arrays of NumPy type S.
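To check from Python which kind of string actually landed in the file, one option is to ask h5py's low-level layer for the stored type of each attribute. The sketch below is not part of the original session. It assumes Python 3 with a recent h5py, uses an in-memory file (the name strings_sketch.hdf5 is made up), and writes np.bytes_ rather than np.string_, since np.string_ was an alias for np.bytes_ and was removed in NumPy 2.0. Under Python 3 a plain str is stored as variable-length UTF-8 rather than ASCII.

```python
import h5py
import numpy as np

# In-memory file: the "core" driver with backing_store=False never touches disk.
# The file name is a placeholder chosen for this sketch.
with h5py.File("strings_sketch.hdf5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("dataset", (100,), dtype="f4")

    # A native Python string becomes a variable-length string attribute.
    dset.attrs["title"] = "Dataset from third round of experiments"

    # A NumPy byte string becomes a fixed-length string attribute.
    dset.attrs["title_fixed"] = np.bytes_(b"Another title")

    # Inspect the stored HDF5 types through the low-level attribute IDs.
    var_type = dset.attrs.get_id("title").get_type()
    fixed_type = dset.attrs.get_id("title_fixed").get_type()

    title_is_vlen = var_type.is_variable_str()
    fixed_is_vlen = fixed_type.is_variable_str()
    fixed_size = fixed_type.get_size()  # size in bytes of the fixed-length type

print("title is variable-length:", title_is_vlen)
print("title_fixed is variable-length:", fixed_is_vlen)
print("title_fixed size in bytes:", fixed_size)
```

If a downstream tool can't handle variable-length strings, a check like this is a cheap way to confirm that the attributes it needs were written in fixed-length form.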
By the way, you can also store Unicode strings in the file. They're written out with the
HDF5-approved UTF-8 encoding:
>>> dset.attrs['Yet another title'] = u'String with accent (\u00E9)'
>>> f.flush()
Here's what the file looks like now, with our fixed-length and Unicode strings inside:
$ h5ls -vlr attrsdemo.hdf5/dataset
Opened "attrsdemo.hdf5" with sec2 driver.
dataset                  Dataset {100/100}
    Attribute: Yet\ another\ title scalar
        Type: variable-length null-terminated UTF-8 string
        Data: "String with accent (\37777777703\37777777651)"
    Attribute: ones scalar
        Type: object reference
        Data: DATASET-1:70568
    Attribute: run_id scalar
        Type: native int
        Data: 144
    Attribute: sample_rate scalar
        Type: native double
        Data: 1e+08
    Attribute: title scalar
        Type: variable-length null-terminated ASCII string
        Data: "Dataset from third round of experiments"
    Attribute: title_fixed scalar
        Type: 13-byte null-padded ASCII string
        Data: "Another title"
    Location: 1:800
    Links: 1
    Storage: 400 logical bytes, 0 allocated bytes
    Type: native float
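The odd-looking Data line for the Unicode attribute is nothing more than the two UTF-8 bytes of the accented character. The long octal numbers are consistent with h5ls printing each byte as a signed char, sign-extended to 32 bits. That reading is an inference from the listing above, not documented h5ls behavior, and the helper name below is made up for this sketch. Plain Python is enough to reproduce the numbers:

```python
# The accented character from the attribute value, encoded the way HDF5 stores it.
accent_bytes = "\u00E9".encode("utf-8")  # two bytes: 0xC3 0xA9


def as_sign_extended_octal(byte):
    """Format one byte as a signed char sign-extended to 32 bits, in octal.

    Hypothetical helper: it mimics what the h5ls listing appears to show.
    """
    signed = byte - 256 if byte >= 128 else byte
    return format(signed & 0xFFFFFFFF, "o")


octal_escapes = [as_sign_extended_octal(b) for b in accent_bytes]

print(accent_bytes)
print(octal_escapes)
```

Both bytes have their high bit set, which is why each one shows up as a large number rather than a short escape like \303.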
There is one more thing to mention about strings, and it has to do with the strict
separation in Python 3 between byte strings and text strings.
When you read an attribute from a file, you generally get an object with the same type
as in HDF5. So if we were to store a NumPy int32, we would get an int32 back.
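A quick round trip makes this concrete. The following is a minimal sketch, assuming a recent h5py and an in-memory file whose name (readback_sketch.hdf5) is made up for the example. It stores a NumPy int32 and inspects what comes back:

```python
import h5py
import numpy as np

# In-memory file (core driver, no backing store); the name is a placeholder.
with h5py.File("readback_sketch.hdf5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("dataset", (100,), dtype="f4")

    # Store a 32-bit integer attribute...
    dset.attrs["run_id"] = np.int32(144)

    # ...and read it back. The value is a NumPy scalar, not a plain Python int.
    run_id = dset.attrs["run_id"]
    run_id_type = type(run_id)

print(run_id, run_id_type)
```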