subsetByReadIndices
Class: matlab.io.datastore.Subsettable
Namespace: matlab.io.datastore
Create subset of datastore or file-set with the specified read indices
Since R2022b
Syntax
subds = subsetByReadIndices(ds,indices)
Description
subds = subsetByReadIndices(ds,indices) creates a subset of the specified datastore or file-set using the specified read indices. The subset subds is of the same type as the input.
Input Arguments
ds — Input datastore or file-set
matlab.io.Datastore object | FileSet object | DsFileSet object | BlockedFileSet object
Input datastore or file-set, specified as a matlab.io.Datastore, FileSet, DsFileSet, or BlockedFileSet object.
indices — Indices of files to include in subset
numeric vector | logical vector
Indices of files to include in the subset, specified as a numeric vector of indices or a logical vector. The subsetByReadIndices method creates a subset subds containing the files at the specified numeric indices, or the files corresponding to the elements of the logical vector that have a value of true.
numeric vector: Vector containing unique indices of files in the input datastore.
logical vector: Vector the same length as the number of files in the input datastore.
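The two index forms can be compared with ordinary MATLAB indexing. The sketch below is illustrative only: the names array stands in for the list of files in a datastore and is a hypothetical placeholder, not part of any datastore API. It shows that a numeric vector and a logical vector of matching length select the same elements, and it checks the two constraints listed above.

```matlab
% Hypothetical stand-in for the files held by a datastore.
names = ["/a"; "/b"; "/c"; "/d"];

% Numeric form: unique positions of the items to keep.
numericIdx = [1 3];
assert(numel(unique(numericIdx)) == numel(numericIdx))  % indices must be unique

% Logical form: one element per item, true marks an item to keep.
logicalIdx = [true false true false];
assert(numel(logicalIdx) == numel(names))               % same length as the input

fromNumeric = names(numericIdx);
fromLogical = names(logicalIdx);

% Both forms select the first and third items.
assert(isequal(fromNumeric, fromLogical))

% A logical vector can be converted to numeric indices with find.
assert(isequal(find(logicalIdx), numericIdx))
```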
Examples
Build Datastore with Subset Support
Build a datastore with subset processing support and use it to bring your data into MATLAB®.
Create a class definition file that contains the code implementing your datastore. Save this file in your working folder or in a folder that is on the MATLAB path. The name of the .m file must be the same as the name of your object constructor function. In this example, create the MyHDF5Datastore class in a file named MyHDF5Datastore.m. The .m class definition contains the following steps:
Step 1: Inherit from the matlab.io.Datastore and matlab.io.datastore.Subsettable classes.
Step 2: Define the constructor as well as the subsetByReadIndices and maxpartitions methods.
Step 3: Define your custom file-reading function. Here, the MyHDF5Datastore class creates and uses the listHDF5Datasets function.
%% STEP 1
classdef MyHDF5Datastore < matlab.io.Datastore ...
                         & matlab.io.datastore.Subsettable

    properties
        Filename (1, 1) string
        Datasets (:, 1) string {mustBeNonmissing} = "/"
        CurrentDatasetIndex (1, 1) double {mustBeInteger, mustBeNonnegative} = 1
    end

    %% STEP 2
    methods
        function ds = MyHDF5Datastore(Filename, Location)
            arguments
                Filename (1, 1) string
                Location (1, 1) string {mustBeNonmissing} = "/"
            end
            ds.Filename = Filename;
            ds.Datasets = listHDF5Datasets(ds.Filename, Location);
        end

        function [data, info] = read(ds, varargin)
            % Read the next dataset and advance the read position.
            if ~hasdata(ds)
                error("No more datasets to read.");
            end
            dataset = ds.Datasets(ds.CurrentDatasetIndex);
            data = { h5read(ds.Filename, dataset, varargin{:}) };
            if nargout > 1
                info = h5info(ds.Filename, dataset);
            end
            ds.CurrentDatasetIndex = ds.CurrentDatasetIndex + 1;
        end

        function tf = hasdata(ds)
            tf = ds.CurrentDatasetIndex <= numel(ds.Datasets);
        end

        function reset(ds)
            ds.CurrentDatasetIndex = 1;
        end
    end

    methods (Access = protected)
        function subds = subsetByReadIndices(ds, indices)
            % Keep only the selected datasets in a copy of the datastore.
            datasets = ds.Datasets(indices);
            subds = copy(ds);
            subds.Datasets = datasets;
            reset(subds);
        end

        function n = maxpartitions(ds)
            n = numel(ds.Datasets);
        end
    end
end

%% STEP 3
function datasets = listHDF5Datasets(filename, location, args)
    arguments
        filename (1, 1) string
        location (1, 1) string
        args.IncludeSubGroups (1, 1) logical = true
    end
    if strlength(location) == 0
        location = "/";
    end
    info = h5info(filename, location);
    datasets = listDatasetsInH5infoStruct(info, location, ...
        IncludeSubGroups=args.IncludeSubGroups);
end

function datasets = listDatasetsInH5infoStruct(S, location, args)
    arguments
        S (1, 1) struct
        location (1, 1) string
        args.IncludeSubGroups (1, 1) logical = true
    end
    datasets = string.empty(0, 1);
    if isfield(S, "Datatype")
        % S describes a single dataset.
        datasets = location;
    elseif isfield(S, "Datasets")
        % S describes a group: list its datasets, then its child groups.
        if ~isempty(S.Datasets)
            datasets = location + "/" + {S.Datasets.Name}';
        end
        if args.IncludeSubGroups
            listFcn = @(group) listDatasetsInH5infoStruct(group, group.Name, ...
                IncludeSubGroups=true);
        else
            listFcn = @(group) string(group.Name);
        end
        childDatasets = arrayfun(listFcn, S.Groups, UniformOutput=false);
        childDatasets = vertcat(childDatasets{:});
        datasets = [datasets; childDatasets];
    end
end
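The class above could be exercised roughly as follows. This is a minimal sketch under stated assumptions, not part of the original example: it assumes MyHDF5Datastore.m is saved on the MATLAB path, that the Subsettable mixin supplies the public subset function (which is expected to call the protected subsetByReadIndices method), and it uses a temporary HDF5 file with hypothetical dataset names "/g/a", "/g/b", and "/g/c".

```matlab
% Assumption: MyHDF5Datastore.m (the class shown above) is on the MATLAB path.

% Create a small temporary HDF5 file with three datasets in group /g.
% The group and dataset names are hypothetical and used only for illustration.
filename = string(tempname) + ".h5";
names = ["a" "b" "c"];
for k = 1:numel(names)
    h5create(filename, "/g/" + names(k), [1 3]);
    h5write(filename, "/g/" + names(k), k * [1 2 3]);
end

% Build the custom datastore over the datasets in group /g.
ds = MyHDF5Datastore(filename, "/g");

% Assumption: subset is the public entry point provided by the mixin.
subds = subset(ds, [1 3]);

% The subset has the same type as the input and holds two datasets.
assert(isa(subds, "MyHDF5Datastore"))
assert(numel(subds.Datasets) == 2)

% Read every dataset remaining in the subset.
while hasdata(subds)
    data = read(subds);
    disp(data{1})
end

delete(filename)   % remove the temporary file
```

Because subsetByReadIndices works on a copy of the datastore, ds itself still lists all three datasets after the call.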
Extended Capabilities
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
Usage notes and limitations:
In a thread-based environment, you can use subsetByReadIndices only with the following datastores:
- ImageDatastore objects
- CombinedDatastore, SequentialDatastore, or TransformedDatastore objects you create from ImageDatastore objects by using combine or transform
You can use subsetByReadIndices with other datastores if you have Parallel Computing Toolbox™. To do so, run the function using a process-backed parallel pool instead of using backgroundPool or ThreadPool (use either ProcessPool or ClusterPool).
For more information, see Run MATLAB Functions in Thread-Based Environment.
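As a rough illustration of the supported case, the sketch below creates a subset of an ImageDatastore on the background pool. It is written under assumptions rather than taken from this page: it uses the public subset function as the entry point and assumes its thread support matches the limitations listed above, and the folder and image files are temporary placeholders generated only for the example.

```matlab
% Write three tiny placeholder images to a temporary folder.
folder = tempname;
mkdir(folder);
for k = 1:3
    imwrite(zeros(4, "uint8"), fullfile(folder, "img" + k + ".png"));
end

imds = imageDatastore(folder);

% Assumption: subset is the public function that applies the read indices.
% Run it on the background pool and collect the result.
f = parfeval(backgroundPool, @subset, 1, imds, [1 3]);
subds = fetchOutputs(f);

% The subset keeps two of the three files.
assert(numel(subds.Files) == 2)

rmdir(folder, "s")   % remove the temporary folder and its images
```

For a datastore type that is not in the list above, the same call could instead target a process-backed pool, which requires Parallel Computing Toolbox.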
Version History
Introduced in R2022b