h5create
Create HDF5 dataset
Description
Examples
Create Fixed-Size Dataset
Create a fixed-size 100-by-200-by-300 dataset myDataset with full path /g1/g2/myDataset.
h5create("myFile.h5","/g1/g2/myDataset",[100 200 300])
Write data to myDataset. Because the dimensions of myDataset are fixed, the amount of data to be written must match the size of the dataset.
myData = ones(100,200,300);
h5write("myFile.h5","/g1/g2/myDataset",myData)
h5disp("myFile.h5")
HDF5 myFile.h5
Group '/'
    Group '/g1'
        Group '/g1/g2'
            Dataset 'myDataset'
                Size: 100x200x300
                MaxSize: 100x200x300
                Datatype: H5T_IEEE_F64LE (double)
                ChunkSize: []
                Filters: none
                FillValue: 0.000000
Create and Compare Datasets with Compression
Create two HDF5 files, each containing a 1000-by-2000 dataset. Use the deflate filter with maximum compression for the first dataset, and use the SZIP filter with entropy encoding for the second. You must specify a chunk size when applying compression filters.
h5create("myFileDeflate.h5","/myDatasetDeflate",[1000 2000], ...
    ChunkSize=[50 80],Deflate=9)
h5create("myFileSZIP.h5","/myDatasetSZIP",[1000 2000], ...
    ChunkSize=[50 80],SZIPEncodingMethod="entropy")
Display the contents of the two files and observe the different filters.
h5disp("myFileDeflate.h5")
HDF5 myFileDeflate.h5
Group '/'
    Dataset 'myDatasetDeflate'
        Size: 1000x2000
        MaxSize: 1000x2000
        Datatype: H5T_IEEE_F64LE (double)
        ChunkSize: 50x80
        Filters: deflate(9)
        FillValue: 0.000000
h5disp("myFileSZIP.h5")
HDF5 myFileSZIP.h5
Group '/'
    Dataset 'myDatasetSZIP'
        Size: 1000x2000
        MaxSize: 1000x2000
        Datatype: H5T_IEEE_F64LE (double)
        ChunkSize: 50x80
        Filters: szip
        FillValue: 0.000000
Write randomized data to each dataset.
myData = rand([1000 2000]);
h5write("myFileDeflate.h5","/myDatasetDeflate",myData)
h5write("myFileSZIP.h5","/myDatasetSZIP",myData)
Compare the compression filters by examining the sizes of the resulting files. For this data, the deflate filter provides greater compression.
deflateListing = dir("myFileDeflate.h5");
SZIPListing = dir("myFileSZIP.h5");
deflateFileSize = deflateListing.bytes
deflateFileSize = 15117631
SZIPFileSize = SZIPListing.bytes
SZIPFileSize = 16027320
sizeRatio = deflateFileSize/SZIPFileSize
sizeRatio = 0.9432
Create Dataset with Unlimited Dimension
Create a two-dimensional dataset myDataset3 that is unlimited along the second dimension. You must specify the ChunkSize name-value argument when setting any dimension of the dataset to Inf.
h5create("myFile.h5","/myDataset3",[200 Inf],ChunkSize=[20 20])
Write data to myDataset3. You can write data of any size along the second dimension because this dimension is unlimited. Additionally, because one dimension of the dataset is unlimited, you must specify the start and count arguments when writing data to the dataset.
myData = rand(200,500);
h5write("myFile.h5","/myDataset3",myData,[1 1],[200 500])
Display the entire contents of the HDF5 file.
h5disp("myFile.h5")
HDF5 myFile.h5
Group '/'
    Dataset 'myDataset3'
        Size: 200x500
        MaxSize: 200xInf
        Datatype: H5T_IEEE_F64LE (double)
        ChunkSize: 20x20
        Filters: none
        FillValue: 0.000000
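Because the second dimension is unlimited, a later write can extend the dataset by starting past the existing data. The following is a minimal sketch of that pattern, not part of the original example; it assumes a fresh file in the temporary folder (the file name is hypothetical) so that it does not collide with an existing myFile.h5.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

% Same layout as the example above: unlimited along the second dimension.
h5create(fileName,"/myDataset3",[200 Inf],ChunkSize=[20 20])

% First block fills columns 1 through 500.
firstBlock = rand(200,500);
h5write(fileName,"/myDataset3",firstBlock,[1 1],[200 500])

% Second block starts at column 501, so the first block is left untouched
% and the dataset grows along the unlimited dimension.
secondBlock = rand(200,300);
h5write(fileName,"/myDataset3",secondBlock,[1 501],[200 300])

% Query the current and maximum sizes of the dataset.
info = h5info(fileName,"/myDataset3");
currentSize = info.Dataspace.Size
maxSize = info.Dataspace.MaxSize
```

The start argument is one-based, so [1 501] addresses the first row of the column immediately after the existing 500 columns.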
Input Arguments
filename
— Name of HDF5 file
string scalar | character vector
Name of the HDF5 file, specified as a string scalar or character vector. If filename does not already exist, then the h5create function creates the file.
Depending on the location to which you are writing, filename can take one of these forms.
Location | Form
---|---
Current folder | To write to the current folder, specify the name of the file in filename.
Other folders | To write to a folder different from the current folder, specify the full or relative path name in filename.
Remote location | To write to a remote location, specify the location in filename. The required form depends on the remote location. For more information, see Work with Remote Data.
ds
— Dataset name
string scalar | character vector
Dataset name, specified as a string scalar or character vector containing the full pathname of the dataset to be created. If you specify a dataset that does not currently exist, then the h5create function creates the dataset. Additionally, if you specify intermediate groups that do not currently exist, then the h5create function creates those groups.
Example: "/myDataset"
Example: "/g1/g2/myNestedDataset"
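As an illustration of the group-creation behavior, the sketch below creates a nested dataset in a new file and then inspects the resulting hierarchy with h5info. The temporary file name is an assumption made for this sketch.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

% Neither /g1 nor /g1/g2 exists yet; h5create creates both groups
% on the way to creating the dataset.
h5create(fileName,"/g1/g2/myNestedDataset",[10 20])

% Walk the hierarchy reported by h5info.
info = h5info(fileName);
outerGroup = info.Groups(1).Name
innerGroup = info.Groups(1).Groups(1).Name
datasetName = info.Groups(1).Groups(1).Datasets(1).Name
```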
sz
— Dataset size
scalar | row vector
Dataset size, specified as a scalar or row vector. To specify an unlimited dimension, specify the corresponding element of sz as Inf. In this case, you must also specify ChunkSize.
Example: 50
Example: [2000 1000]
Example: [100 200 Inf]
Data Types: double
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: h5create("myFile.h5","/dataset1",[1000 2000],ChunkSize=[50 80],CustomFilterID=307,CustomFilterParameters=6) creates the 1000-by-2000 dataset dataset1 in the HDF5 file myFile.h5 using 50-by-80 chunks, the registered bzip2 filter (identifier 307), and a compression block size of 6.
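The Name=Value form is available in MATLAB starting in R2021a. In earlier releases, each name is passed as a quoted string followed by its value, separated by commas. The sketch below shows both forms side by side; it uses the built-in deflate filter instead of the bzip2 plugin so that it runs without a third-party filter installed, and the file and dataset names are hypothetical.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

% Name=Value syntax.
h5create(fileName,"/datasetA",[1000 2000],ChunkSize=[50 80],Deflate=5)

% Equivalent comma-separated syntax with quoted names.
h5create(fileName,"/datasetB",[1000 2000],"ChunkSize",[50 80],"Deflate",5)

% Both calls should produce the same chunk layout.
infoA = h5info(fileName,"/datasetA");
infoB = h5info(fileName,"/datasetB");
sameChunks = isequal(infoA.ChunkSize,infoB.ChunkSize)
```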
Datatype
— Data type of dataset
"double"
(default) | "single" | "uint64" | "uint32" | "uint16" | …
Data type of the dataset, specified as one of these values, representing MATLAB® data types:
"double"
"single"
"uint64"
"int64"
"uint32"
"int32"
"uint16"
"int16"
"uint8"
"int8"
"string"
Data Types: string | char
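A short sketch of a non-default Datatype: the dataset is created as int32, written, and read back, and the class of the returned array is checked. The file and dataset names are hypothetical.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

% Create a 5-by-4 dataset that stores 32-bit signed integers.
h5create(fileName,"/counts",[5 4],Datatype="int32")

% Write matching int32 data and read it back.
counts = int32(reshape(1:20,5,4));
h5write(fileName,"/counts",counts)
readBack = h5read(fileName,"/counts");

% The data comes back with the class that matches the stored type.
storedClass = class(readBack)
```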
ChunkSize
— Chunk size
scalar | row vector
Chunk size, specified as a scalar or row vector containing the dimensions of the chunk. If any entry of sz is Inf, then you must specify ChunkSize. The length of ChunkSize must equal the length of sz, and each entry of ChunkSize must be less than or equal to the corresponding entry of sz.
Example: 10
Example: [20 10 100]
Data Types: double
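The two constraints on ChunkSize (same length as sz, each entry no larger than the corresponding entry of sz) can be checked before calling h5create. The sketch below does this for a three-dimensional dataset with one unlimited dimension; the variable, file, and dataset names are hypothetical.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

sz = [100 200 Inf];      % unlimited along the third dimension
chunk = [20 10 100];     % one chunk entry per dataset dimension

% Check the documented constraints up front. Any finite chunk entry
% satisfies the comparison against an Inf dimension.
assert(numel(chunk) == numel(sz),"ChunkSize and sz must have equal length.")
assert(all(chunk <= sz),"Each chunk entry must not exceed the dataset size.")

h5create(fileName,"/volume",sz,ChunkSize=chunk)

% Confirm the chunk layout recorded in the file.
info = h5info(fileName,"/volume");
storedChunk = info.ChunkSize
```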
Deflate
— Deflate compression level
0 (default) | integer scalar value from 0 to 9
Deflate compression level, specified as an integer scalar value from 0 to 9. The default value of 0 indicates no compression. A value of 1 indicates the least compression, and a value of 9 indicates the most. If you specify Deflate, you must also specify ChunkSize.
You cannot specify both Deflate and SZIPEncodingMethod in the same function call.
Data Types: double
FillValue
— Fill value for missing data
0 (default) | numeric value
Fill value for missing data in numeric datasets, specified as a numeric value.
Data Types: double | single | uint8 | uint16 | uint32 | uint64 | int8 | int16 | int32 | int64
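To see the fill value in practice, the sketch below writes only part of a dataset and reads the whole dataset back; the elements that were never written are expected to come back as the fill value. The file and dataset names and the choice of -1 are assumptions made for illustration.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

% Create a 4-by-6 dataset whose unwritten elements read back as -1.
h5create(fileName,"/partial",[4 6],FillValue=-1)

% Write only the first two columns.
h5write(fileName,"/partial",ones(4,2),[1 1],[4 2])

% Read the full dataset: columns 1-2 hold the written data,
% columns 3-6 hold the fill value.
data = h5read(fileName,"/partial");
writtenPart = data(:,1:2)
unwrittenPart = data(:,3:6)
```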
Fletcher32
— 32-bit Fletcher checksum filter
false or 0 (default) | true or 1
32-bit Fletcher checksum filter, specified as a numeric or logical 1 (true) or 0 (false). A Fletcher checksum filter verifies that the transferred data in a file is error-free. If you specify Fletcher32, you must also specify ChunkSize.
Data Types: logical | double
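A minimal sketch of enabling the checksum filter: the dataset is created with Fletcher32 and a chunk size, data is written and read back, and the filter list reported by h5info is displayed. The file and dataset names are hypothetical, and the exact filter name that h5info reports is not assumed here.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

% The checksum filter operates on chunks, so ChunkSize is required.
h5create(fileName,"/checked",[100 100],ChunkSize=[25 25],Fletcher32=true)

% Write data and read it back through the filter.
original = rand(100,100);
h5write(fileName,"/checked",original)
roundTrip = h5read(fileName,"/checked");

% Inspect which filters the dataset reports.
info = h5info(fileName,"/checked");
filterNames = {info.Filters.Name}
```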
Shuffle
— Shuffle filter
false or 0 (default) | true or 1
Shuffle filter, specified as a numeric or logical 1 (true) or 0 (false). A shuffle filter improves the compression ratio by rearranging the byte order of data stored in memory. If you specify Shuffle, you must also specify ChunkSize.
Data Types: logical | double
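The shuffle filter is typically paired with a compression filter such as deflate. The sketch below writes the same integer data to two files, one with Deflate only and one with Deflate plus Shuffle, and reports both file sizes. How much shuffling helps depends on the data, so the sketch makes no claim about which file is smaller. The file and dataset names and the test data are assumptions.

```matlab
% Hypothetical files in the temporary folder, used only for this sketch.
plainFile = string(tempname) + ".h5";
shuffledFile = string(tempname) + ".h5";

% Identical layout and compression level; only Shuffle differs.
h5create(plainFile,"/d",[1000 2000],Datatype="int32", ...
    ChunkSize=[50 80],Deflate=9)
h5create(shuffledFile,"/d",[1000 2000],Datatype="int32", ...
    ChunkSize=[50 80],Deflate=9,Shuffle=true)

% Small integer values stored in a 4-byte type.
data = int32(randi(1000,1000,2000));
h5write(plainFile,"/d",data)
h5write(shuffledFile,"/d",data)

% Compare the resulting file sizes.
plainListing = dir(plainFile);
shuffledListing = dir(shuffledFile);
plainBytes = plainListing.bytes
shuffledBytes = shuffledListing.bytes
```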
TextEncoding
— Text encoding
"UTF-8"
(default) | "system"
Text encoding, specified as one of these values:
"UTF-8": Represent characters using UTF-8 encoding.
"system": Represent characters as bytes using the system encoding (not recommended).
Data Types: string | char
CustomFilterID
— Filter identifier
positive integer
Filter identifier for the registered filter plugin assigned by The HDF Group, specified as a positive integer. For a list of registered filters, see the Filters page on The HDF Group website.
If you do not specify a value for CustomFilterID, then the dataset does not use dynamically loaded filters for compression.
If you specify CustomFilterID, you must also specify ChunkSize.
Data Types: double | single | uint8 | uint16 | uint32 | uint64 | int8 | int16 | int32 | int64
CustomFilterParameters
— Filter parameters
numeric scalar | numeric row vector
Filter parameters for third-party filters, specified as a numeric scalar or numeric row vector. If you specify CustomFilterID without also specifying this argument, then the h5create function passes an empty vector to the HDF5 library and the filter uses default parameters.
This name-value argument corresponds to the cd_values argument of the H5Pset_filter function in the HDF5 library.
If you specify CustomFilterParameters, you must also specify CustomFilterID.
Data Types: double | single | uint8 | uint16 | uint32 | uint64 | int8 | int16 | int32 | int64
SZIPEncodingMethod
— Encoding method for SZIP compression
"entropy"
| "nearestneighbor"
Since R2024b
Encoding method for SZIP compression, specified as "entropy" or "nearestneighbor". The entropy method is best suited for data that has already been processed; the nearestneighbor method preprocesses the data and then applies the entropy method. If you specify SZIPEncodingMethod, you must also specify ChunkSize.
You cannot specify both SZIPEncodingMethod and Deflate in the same function call.
Data Types: string | char
SZIPPixelsPerBlock
— Number of pixels per block for SZIP compression
16 (default) | even integer from 2 to 32
Since R2024b
Number of pixels (HDF5 data elements) per block for SZIP compression, specified as an even integer from 2 to 32. If you specify SZIPPixelsPerBlock, you must also specify SZIPEncodingMethod. The value of SZIPPixelsPerBlock must be less than or equal to the number of elements in each dataset chunk.
Example: 32
Data Types: double | single | uint8 | uint16 | uint32 | uint64 | int8 | int16 | int32 | int64
More About
Chunk Storage in HDF5
Chunk storage refers to a method of storing a dataset in memory by dividing it into smaller pieces of data known as chunks. Chunking a dataset can improve performance when operating on a subset of the dataset, since the chunks can be read and written to the HDF5 file individually.
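As a sketch of operating on a subset, the code below creates a chunked dataset and then reads a single block whose start and count line up with one chunk boundary. The file and dataset names are hypothetical, and the chunk size is chosen only for illustration.

```matlab
% Hypothetical file in the temporary folder, used only for this sketch.
fileName = string(tempname) + ".h5";

% Store a 1000-by-2000 dataset in 50-by-80 chunks.
h5create(fileName,"/grid",[1000 2000],ChunkSize=[50 80])
fullData = rand(1000,2000);
h5write(fileName,"/grid",fullData)

% Read one 50-by-80 block starting at row 51, column 81. This request
% coincides with a single chunk (the second chunk along each dimension).
block = h5read(fileName,"/grid",[51 81],[50 80]);
blockSize = size(block)
```

Reads that are not aligned to chunk boundaries also work; they may touch more than one chunk.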
Tips
To enable both the deflate and SZIP filters on the same dataset, use the low-level H5P.set_deflate and H5P.set_szip functions.
Version History
Introduced in R2011a
R2024b: Create datasets with SZIP compression
You can create datasets with SZIP compression by using the SZIPEncodingMethod and SZIPPixelsPerBlock name-value arguments.
R2022a: Use dynamically loaded filters to create dataset
You can use the CustomFilterID and CustomFilterParameters name-value arguments to enable compression using dynamically loaded filters.
R2020b: Create HDF5 files at a remote location
You can create HDF5 files in remote locations, such as Amazon S3, Windows Azure Blob Storage, and HDFS™.
R2020b: Create HDF5 files with Unicode names
You can create HDF5 files whose names are encoded as Unicode characters.