>>> dset = f.create_dataset('unlimited', (2, 2), maxshape=(2, None))
>>> dset.shape
(2, 2)
>>> dset.maxshape
(2, None)
>>> dset.resize((2, 3))
>>> dset.shape
(2, 3)
>>> dset.resize((2, 2**30))
>>> dset.shape
(2, 1073741824)
You can mark as many axes as you want as unlimited.
Finally, no matter what you put in maxshape, you can't change the total number of axes.
This value, the rank of the dataset, is fixed and can never be changed:
>>> dset.resize((2, 2, 2))
TypeError: New shape length (3) must match dataset rank (2)
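As a minimal sketch of the two points above, the script below creates a dataset with both axes unlimited, grows it along each axis, and then confirms that a rank change is rejected. The file name 'resize_demo.hdf5', the dataset name 'both_unlimited', and the use of the in-memory core driver are assumptions made for illustration; they are not part of the original example.

```python
import h5py
import numpy as np

# Assumption: an in-memory file (core driver, no backing store) so that
# nothing is written to disk. The file name is only a placeholder.
f = h5py.File('resize_demo.hdf5', 'w', driver='core', backing_store=False)

# Mark both axes as unlimited by passing None for each entry of maxshape.
dset = f.create_dataset('both_unlimited', (2, 2), dtype=np.int32,
                        maxshape=(None, None))

dset.resize((5, 2))   # grow along the first axis
dset.resize((5, 7))   # grow along the second axis
shape_after = dset.shape

# The rank is fixed: asking for three axes on a rank-2 dataset fails.
try:
    dset.resize((5, 7, 1))
    rank_error = None
except TypeError as exc:
    rank_error = str(exc)

rank_after = len(dset.shape)  # still 2
f.close()
```

If the data really needs an extra axis, the usual route is to create a new dataset with the desired rank and copy the data across.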
Data Shuffling with resize
NumPy has a set of rules that apply when you change the shape of an array. For example,
take a simple four-element square array with shape (2, 2):
>>> a = np.array([[1, 2], [3, 4]])
>>> a.shape
(2, 2)
>>> print(a)
[[1 2]
 [3 4]]
If we now resize it to (1, 4), keeping the total number of elements unchanged, the values
are still there but rearrange themselves:
>>> a.resize((1, 4))
>>> print(a)
[[1 2 3 4]]
And finally if we resize it to (1, 10), adding new elements, the new ones are initialized
to zero:
>>> a.resize((1, 10))
>>> print(a)
[[1 2 3 4 0 0 0 0 0 0]]
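One way to picture the NumPy behavior shown above is that resize treats the array as a flat, row-major sequence of values, keeps as many of them as fit, and pads the rest with zeros. The sketch below emulates that with a hypothetical helper, resized_copy, which is not part of NumPy, and checks it against the real in-place resize.

```python
import numpy as np

def resized_copy(arr, new_shape):
    """Hypothetical helper (not a NumPy function): return a new array that
    mimics what arr.resize(new_shape) would produce in place."""
    n_new = int(np.prod(new_shape))
    out = np.zeros(n_new, dtype=arr.dtype)   # new elements start at zero
    flat = arr.ravel()                       # values in row-major order
    n_keep = min(flat.size, n_new)
    out[:n_keep] = flat[:n_keep]             # existing values are carried over
    return out.reshape(new_shape)

a = np.array([[1, 2], [3, 4]])

wide = resized_copy(a, (1, 4))     # same element count, values rearranged
padded = resized_copy(a, (1, 10))  # more elements, zero-filled at the end

# Compare against NumPy's in-place resize on a private copy.
b = a.copy()
b.resize((1, 10), refcheck=False)
matches_numpy = np.array_equal(b, padded)
```

The position of a value within the flat sequence, not its row and column index, decides where it ends up after the resize.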
If you've reshaped NumPy arrays before, you're likely used to this reshuffling behavior.
HDF5 has a different approach. No reshuffling is ever performed. Let's create a
Dataset object to experiment on, which has both axes set to unlimited:
>>> dset = f.create_dataset('sizetest', (2, 2), dtype=np.int32, maxshape=(None, None))
>>> dset[...] = [[1, 2], [3, 4]]