filenames2labels

Get list of labels from filenames

    Description

    lbls = filenames2labels(loc) creates a list of labels lbls based on the filenames in the specified location loc.

    lbls = filenames2labels(ds) creates a list of labels based on the filenames contained in ds. ds can be a datastore, matlab.io.datastore.FileSet object, or matlab.io.datastore.BlockedFileSet object.

    lbls = filenames2labels(___,Name=Value) specifies additional name-value arguments. For example, IncludeSubfolders=true includes subfolders in the scan for labels.

    [lbls,files] = filenames2labels(___) also returns files, the list of files that the function scanned for labels. The ith label in lbls corresponds to the ith file in files.
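
As a minimal sketch of the two-output syntax, the following code builds a scratch folder of empty files and pairs each returned file with its label. The folder and the file names are hypothetical, and the sketch assumes that Signal Processing Toolbox is installed and that filenames2labels inspects only file names, not file contents.

```matlab
% Create a scratch folder containing hypothetical, empty files.
tmp = tempname;
mkdir(tmp);
for name = ["walk_01.csv" "run_01.csv" "sit_01.csv"]
    fclose(fopen(fullfile(tmp,name),"w"));
end

% Request both the labels and the files they were derived from.
[lbls,files] = filenames2labels(tmp);

% The ith label belongs to the ith file, so the two outputs can be
% placed side by side in a table.
T = table(files(:),lbls(:),VariableNames=["File" "Label"]);
disp(T)

rmdir(tmp,"s");   % remove the scratch folder
```

Keeping files next to lbls is useful when the labels are later attached to the data read from each file.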

    Examples

    Specify the path to the sample signals included with Signal Processing Toolbox™.

    folder = fullfile(matlabroot,"examples","signal","data");

    Create a list of labels based on the .wav filenames located in folder.

    lbls = filenames2labels(folder,FileExtensions=".wav")
    lbls = 3x1 categorical
         guitartune 
         noisymusic 
         speech_dft 
    
    

    Specify the path to a subfolder within folder that contains a collection of chirp signals. Each signal is stored in a file called chirp_X, where X is an integer from 1 to 10. Use a pattern object to extract labels that consist of chirp_ followed by exactly one digit. Because digitsPattern(1) matches a single digit, the file chirp_10 also produces the label chirp_1.

    subfolder = fullfile(folder,"sample_chirps");
    p = "chirp_" + digitsPattern(1);
    
    plbls = filenames2labels(subfolder,Extract=p)
    plbls = 10x1 categorical
         chirp_1 
         chirp_1 
         chirp_2 
         chirp_3 
         chirp_4 
         chirp_5 
         chirp_6 
         chirp_7 
         chirp_8 
         chirp_9 
    
    

    Specify the path to the sample signals included with Signal Processing Toolbox™.

    folder = fullfile(matlabroot,"examples","signal","data");

    Create a signal datastore that points to the MAT-files located in folder.

    ds = signalDatastore(folder,FileExtensions=".mat");

    Create a list of labels based on the filenames contained in the datastore.

    lbls = filenames2labels(ds)
    lbls = 54x1 categorical
         BufferedHumanActivity 
         BufferedHumanactivity 
         EMGdata 
         EMGindex 
         HeartRates 
         Hello 
         HighModalDensData 
         INR 
         PNGait 
         PNGaitSegments 
         Ring 
         SignalAnalyzerEx1 
         SpeechTranscription 
         Transcription 
         WestAfricanEbolaOutbreak2014 
         Whale_Songs 
         ampoutput1 
         ampoutput2 
         batsignal 
         bostemp 
         clippedpeaks 
         clock 
         clocksig 
         dem_ROI 
         designedFilter 
         earthquake 
         ecg60Hz 
         ecgSignals 
         edfw 
         engineRPM 
          ⋮
    
    

    Input Arguments

    Files or folders to scan for labels (loc), specified as a character vector, a cell array of character vectors, a string scalar, or a string array. The value contains the location of local or remote files or folders.

    • Local files or folders — Specify loc as a local path to files or folders. If the files are not in the current folder, then the local path must specify full or relative paths. Files within subfolders of the specified folder are not automatically included; to include them, set IncludeSubfolders to true. You can use the wildcard character (*) when specifying the local path. This character specifies that the file search include all matching files or all files in the matching folders.

    • A remote location specified using an internationalized resource identifier (IRI).

    • Remote files or folders — Specify loc to be the full paths of the files or folders as a uniform resource locator (URL) of the form hdfs:///path_to_file. For more information, see Work with Remote Data.

    By default, filenames2labels scans files of all formats. To restrict the scan to a custom list of file extensions, use the FileExtensions argument.

    Example: 'whale.mat'

    Example: '../dir/data/signal.mat'

    Example: "../dir/data/"

    Example: {'dataFiles/Files_1/' 'dataFiles/Files_2/'}

    Example: ["dataFiles/Files_1/" "dataFiles/Files_2/"]

    Data Types: char | string | cell
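
The following sketch illustrates two of the forms of loc described above: a local path that contains the wildcard character and a string array that lists individual files. The scratch folder and file names are hypothetical, and the sketch assumes that filenames2labels needs only the file names, so the files are left empty.

```matlab
% Create a scratch folder containing hypothetical, empty files.
tmp = tempname;
mkdir(tmp);
for name = ["rec_a.csv" "rec_b.csv" "notes.txt"]
    fclose(fopen(fullfile(tmp,name),"w"));
end

% Wildcard form: scan only the files whose names match rec_*.csv.
lblsWild = filenames2labels(fullfile(tmp,"rec_*.csv"));

% String array form: list the files to scan explicitly.
loc = [fullfile(tmp,"rec_a.csv") fullfile(tmp,"notes.txt")];
lblsList = filenames2labels(loc);

rmdir(tmp,"s");   % remove the scratch folder
```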

    Data repository (ds), specified as a datastore, matlab.io.datastore.FileSet object, or matlab.io.datastore.BlockedFileSet object.

    • If you specify a datastore, then ds must contain a Files property from which label names are parsed.

    • If you specify a matlab.io.datastore.FileSet object, then the filenames2labels function obtains label names from the filenames listed in the FileInfo property of ds.

    • If you specify a matlab.io.datastore.BlockedFileSet object, then the filenames2labels function obtains the label names from the filenames in the BlockInfo property of ds.
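
The example earlier on this page passes a signal datastore. As a complementary sketch, the following code passes a matlab.io.datastore.FileSet object instead. The scratch folder, the file names, and the variable x are hypothetical placeholders.

```matlab
% Create a scratch folder containing two small, hypothetical MAT-files.
tmp = tempname;
mkdir(tmp);
x = 1;
for name = ["ecg_a.mat" "ecg_b.mat"]
    save(fullfile(tmp,name),"x");
end

% Collect the files in a FileSet object and derive labels from it.
fs = matlab.io.datastore.FileSet(tmp,FileExtensions=".mat");
lbls = filenames2labels(fs);
nFiles = fs.NumFiles;   % number of files tracked by the FileSet

rmdir(tmp,"s");   % remove the scratch folder
```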

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

    Example: filenames2labels(loc,"ExtractBetween",[5 8])

    File extensions (FileExtensions), specified as a string scalar, string vector, character vector, or cell array of character vectors. If you do not specify FileExtensions, then filenames2labels creates labels from all files found in the specified location.

    This argument applies only when the input is a file location.

    Example: [".mat" ".csv"]

    Data Types: char | string | cell

    Subfolder inclusion flag (IncludeSubfolders), specified as true or false. If you specify IncludeSubfolders as true, then filenames2labels includes subfolders in the scan for labels.

    This argument applies only when the input is a file location.

    Data Types: double | logical
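
The following sketch combines FileExtensions and IncludeSubfolders on a hypothetical scratch folder that has one subfolder. It assumes that subfolders are skipped unless IncludeSubfolders is true and that empty files are sufficient because only the names are used.

```matlab
% Create a scratch folder with a subfolder and hypothetical, empty files.
tmp = tempname;
mkdir(tmp);
mkdir(fullfile(tmp,"sub"));
names = ["ecg.mat" "resp.csv" "notes.txt" fullfile("sub","emg.csv")];
for name = names
    fclose(fopen(fullfile(tmp,name),"w"));
end

% Scan only .mat and .csv files in the top-level folder.
topOnly = filenames2labels(tmp,FileExtensions=[".mat" ".csv"]);

% Repeat the scan, this time descending into subfolders.
withSub = filenames2labels(tmp,FileExtensions=[".mat" ".csv"], ...
    IncludeSubfolders=true);

rmdir(tmp,"s");   % remove the scratch folder
```

In this sketch, notes.txt is excluded from both calls by its extension, and sub/emg.csv contributes a label only to the second call.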

    Delimiter that marks the end position for the extracted substring (ExtractBefore), specified as a string scalar, pattern object, or positive integer.

    • If you specify a string or pattern object, then the function extracts labels from each filename as the substring that begins with the first character of the filename and ends before the first occurrence of the delimiter string or pattern.

    • If you specify a positive integer, then the function extracts labels from each filename as the substring that begins with the first character of the filename and ends before the position specified by the delimiter index.

    If the string or pattern object is not found in a filename, or if the index is 1 or larger than length(char(filename))+1, then the function sets the label for that filename to undefined.

    Example: 3

    Example: "S"

    Example: digitsPattern + "_"

    Data Types: double | char | string
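
The following sketch applies both forms of the end delimiter to hypothetical file names. One name contains no underscore, which illustrates the undefined-label case described above.

```matlab
% Create a scratch folder containing hypothetical, empty files.
tmp = tempname;
mkdir(tmp);
for name = ["Subject01_walk.csv" "Subject02_run.csv" "calibration.csv"]
    fclose(fopen(fullfile(tmp,name),"w"));
end

% String delimiter: keep the text before the first underscore.
byText = filenames2labels(tmp,ExtractBefore="_");

% Index delimiter: keep the characters before position 8.
byIndex = filenames2labels(tmp,ExtractBefore=8);

% calibration has no underscore, so its label in byText is undefined.
nMissing = nnz(isundefined(byText));

rmdir(tmp,"s");   % remove the scratch folder
```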

    Delimiter that marks the start position for the extracted substring (ExtractAfter), specified as a string scalar, pattern object, or nonnegative integer.

    • If you specify a string or pattern object, then the function extracts labels from each filename as the substring that begins after the first occurrence of the delimiter string or pattern and ends with the last character of the filename.

    • If you specify a nonnegative integer, then the function extracts labels from each filename as the substring that begins after the position specified by the delimiter index and ends with the last character of the filename.

    If the string or pattern object is not found in a filename, or if the index is larger than or equal to length(char(filename))+1, then the function sets the label for that filename to undefined.

    Example: 2

    Example: "Subject"

    Example: "_" + wildcardPattern

    Data Types: double | char | string
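
The following sketch applies both forms of the start delimiter to hypothetical file names. It assumes that the file extension is removed before the extraction, as the labels in the examples on this page suggest.

```matlab
% Create a scratch folder containing hypothetical, empty files.
tmp = tempname;
mkdir(tmp);
for name = ["Subject01_walk.csv" "Subject02_run.csv"]
    fclose(fopen(fullfile(tmp,name),"w"));
end

% String delimiter: keep the text after the first underscore.
byText = filenames2labels(tmp,ExtractAfter="_");

% Index delimiter: keep the characters after position 10,
% which is the position of the underscore in both names.
byIndex = filenames2labels(tmp,ExtractAfter=10);

rmdir(tmp,"s");   % remove the scratch folder
```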

    Delimiters that mark the start and end positions for the extracted substring (ExtractBetween), specified as a two-element string vector or cell array of character vectors, a two-element vector of pattern objects, or a two-element vector of positive integers. For a delimiter equal to [P S]:

    • If you specify a two-element string vector or cell array of character vectors, or a two-element vector of pattern objects, then the function extracts labels from each filename as the substring that begins after P and ends before S.

    • If you specify a two-element vector of positive integers, then the function extracts labels from each filename as the characters indexed by P:S. S must be larger than or equal to P.

    If there are no characters between P and S, or if the index P or S is larger than the length of the filename, then the function sets the label for that filename to undefined.

    Example: [3 7]

    Example: ["A" "D"]

    Example: ["_" "_"]

    Data Types: double | char | string | cell
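
The following sketch shows the string form and the index form of the two-element delimiter on hypothetical file names.

```matlab
% Create a scratch folder containing hypothetical, empty files.
tmp = tempname;
mkdir(tmp);
for name = ["S01_walk_trial1.csv" "S02_run_trial2.csv"]
    fclose(fopen(fullfile(tmp,name),"w"));
end

% String delimiters: keep the text between the two underscores.
activity = filenames2labels(tmp,ExtractBetween=["_" "_"]);

% Index delimiters: keep characters 2 through 3 of each name.
subject = filenames2labels(tmp,ExtractBetween=[2 3]);

rmdir(tmp,"s");   % remove the scratch folder
```

Under the rules above, activity holds the labels walk and run, and subject holds the labels 01 and 02.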

    Pattern used to find the label substring (Extract), specified as a pattern object. The function extracts labels from each filename as the substring that matches the pattern object.

    • If no match is found in a filename, then the function sets the label for that filename to undefined.

    • If the pattern matches more than once in a filename, then the function returns lbls as a matrix with one column per match. The pattern must match the same number of times in every filename.

    Example: lettersPattern

    Example: "_" + wildcardPattern + "_"
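
The following sketch contrasts a pattern that matches once per filename with a pattern that matches twice. The file names are hypothetical. The sizes noted in the comments follow from the matrix rule described above.

```matlab
% Create a scratch folder containing hypothetical, empty files.
tmp = tempname;
mkdir(tmp);
for name = ["S01_walk_trial1.csv" "S02_run_trial2.csv"]
    fclose(fopen(fullfile(tmp,name),"w"));
end

% One match per name: a letter followed by exactly two digits.
oneMatch = filenames2labels(tmp,Extract=lettersPattern(1) + digitsPattern(2));

% Two matches per name: each run of digits is a separate match,
% so each file contributes a row with two labels.
twoMatches = filenames2labels(tmp,Extract=digitsPattern);

rmdir(tmp,"s");   % remove the scratch folder
```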

    Output Arguments

    List of labels (lbls) based on the names of files located in loc or contained in ds, returned as a categorical vector or matrix.

    List of files scanned for labels (files), returned as a string vector. The ith file in files corresponds to the ith label in lbls.

    Version History

    Introduced in R2022b