How to extract data from multiple HDF5 files

First, I need a script that extracts data from a specific range of a dataset in each of many (tens to hundreds of) HDF5 files. So far I have a script from the help section that does this, but only for a single file at a time.
Here's the code I have so far:
plist = 'H5P_DEFAULT';
fid = H5F.open(filename);
dset_id = H5D.open(fid,'/Grid/precipitation');
dims = fliplr([5 3]);
mem_space_id = H5S.create_simple(2,dims,[]);
file_space_id = H5D.get_space(dset_id);
offset = fliplr([1043 3009]);
block = fliplr([5 3]);
H5S.select_hyperslab(file_space_id,'H5S_SELECT_SET',offset,[],[],block);
data = H5D.read(dset_id,'H5ML_DEFAULT',mem_space_id,file_space_id,plist);
data = data';
H5D.close(dset_id);
H5F.close(fid);
One solution I thought of is to put the names of the files in a directory into a text file and run the script in a loop to extract the data from each file. I can create the text file from the command prompt and have MATLAB read it, but I have little idea how to code the rest.
Second, once the data are extracted, I also need to organize them like so.
I did this manually using MS Excel.
Note: Each file corresponds to a specific time interval whereas the indices in the dataset correspond to decimal degree coordinates.
Thanks to anyone who can offer a solution to either part of the problem.

 Accepted Answer

Assumptions:
  • all h5-files are in one folder, e.g. c:\my\data
  • all h5-files in that folder shall be read
  • all h5-files have the extension '.h5'
Try something like:
data = data_from_multiple_HDF5_files( 'c:\my\data', '*.h5' )
where both of the following functions are in one m-file:
function out = data_from_multiple_HDF5_files( folder, glob )
    % Read the same block from every file in folder that matches glob.
    sad = dir( fullfile( folder, glob ) );
    len = length( sad );
    % Preallocate a 1-by-len struct array, one element per file.
    out = struct( 'name',cell(1,len), 'data',[] );
    for jj = 1 : len
        out(jj).name = sad(jj).name;
        out(jj).data = read_one_file( fullfile( folder, sad(jj).name ) );
    end
end
function data = read_one_file( filespec )
    % Read one block of /Grid/precipitation from a single file.
    plist = 'H5P_DEFAULT';
    fid = H5F.open( filespec );
    dset_id = H5D.open(fid,'/Grid/precipitation');
    dims = fliplr([5,3]); % magic numbers: size of the block to read
    mem_space_id = H5S.create_simple(2,dims,[]);
    file_space_id = H5D.get_space(dset_id);
    offset = fliplr([1043,3009]); % magic numbers: zero-based start of the block
    block = fliplr([5,3]); % magic numbers: same size as dims
    H5S.select_hyperslab(file_space_id,'H5S_SELECT_SET',offset,[],[],block);
    data = H5D.read(dset_id,'H5ML_DEFAULT',mem_space_id,file_space_id,plist);
    data = data';
    H5D.close(dset_id);
    H5F.close(fid);
end
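For the second part of the question (organizing the results), one option is to stack the per-file blocks along a third dimension, so that the third index corresponds to the file and therefore to the time interval. This is only a sketch: the layout wanted in the question isn't shown here, the struct array below is a made-up stand-in for the output of data_from_multiple_HDF5_files, and it assumes every file returns a block of the same size.

```matlab
% Stand-in for the struct array returned by data_from_multiple_HDF5_files
% (hypothetical file names and values, for illustration only).
out = struct( 'name', {'a.h5','b.h5'}, 'data', {ones(3,5), 2*ones(3,5)} );

names = { out.name };             % 1-by-N cell array of file names
cube  = cat( 3, out.data );       % rows-by-cols-by-N; page jj belongs to names{jj}

% Time series of one grid cell across all files.
series = squeeze( cube(1,1,:) );

% Spreadsheet-like alternative: one row per file, one column per grid cell.
flat = reshape( cube, [], numel(out) ).';
```

The pages follow the order in which dir() returned the files, so if the time order matters it is probably worth sorting on the names (or on a timestamp parsed from them) before stacking.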

3 Comments

Hello sir, to load the h5 files, do we need to put the location of our files in place of folder, or can we use the same code as is?
Sir, can we use this code to read the data stored in h5 files in a folder? What if we don't know the contents of the h5 files: what should we type in place of dset_id = H5D.open(fid,'/Grid/precipitation')?
I guess you can use data_from_multiple_HDF5_files(), but you have to write your own read_one_file().
I don't think that read_one_file() could be useful to you. You wrote in a recent question: "My h5 files consist of huge data related to atmospheric parameters (e.g. 3DSND_01MAY2018_0000_L2B_SA1.h5)". The function, read_one_file(), is made to read a particular kind of h5-files. It uses low-level h5-functions and it's full of magic numbers.
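If the contents of the files are unknown, the high-level functions h5disp, h5info and h5read may be an easier starting point than the low-level H5* calls. A minimal sketch follows; the demo file and the dataset path /Grid/precipitation are assumptions made so that the code runs on its own, and your files will most likely use different group and dataset names.

```matlab
% Create a small demo file so the sketch is self-contained. With real data,
% set filespec to an existing file and skip the h5create/h5write lines.
filespec = [ tempname, '.h5' ];
h5create( filespec, '/Grid/precipitation', [20 30] );
h5write( filespec, '/Grid/precipitation', reshape( 1:600, 20, 30 ) );

% Print the groups, datasets and attributes found in the file.
h5disp( filespec );

% The same information as a struct, here for the (assumed) group /Grid.
info = h5info( filespec, '/Grid' );
dset_names = { info.Datasets.Name };   % datasets directly under /Grid

% Read a 5-by-3 block. start is one-based and uses MATLAB dimension order.
block = h5read( filespec, '/Grid/precipitation', [2 4], [5 3] );

delete( filespec );   % remove the demo file
```

Once the dataset path is known, it can replace '/Grid/precipitation' in your own read_one_file(). As far as I can tell, the zero-based offset [1043 3009] used there corresponds to a one-based start of [1044 3010] in h5read, with the same count [5 3], and the result would still need the final transpose to match.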
