summary

Print summary of table, timetable, or categorical array

Description

summary(T) prints a summary of the table or timetable T.

  • If T is a table, then the table summary displays the description from T.Properties.Description followed by a summary of the table variables.

  • If T is a timetable, then the timetable summary displays the description from T.Properties.Description, a summary of the row times, and then a summary of the timetable variables.

s = summary(T) returns a structure, s, that contains a summary of the input table or timetable. Each field of s is itself a structure that summarizes the values in the corresponding variable of T. If T is a timetable, then s also has a field that summarizes the row times of T.

summary(A) prints a summary of the categorical array A.

  • If A is a vector, then summary(A) displays the category names along with the number of elements in each category (the category counts). It also displays the number of elements that are undefined.

  • If A is a matrix, then summary treats the columns of A as vectors and displays the category counts for each column of A.

  • If A is a multidimensional array, then summary acts along the first array dimension whose size does not equal 1.
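The counts that summary(A) prints can also be retrieved as values. The following is a minimal sketch that uses the separate functions categories, countcats, and isundefined, which are not described on this page. It is one way to obtain the same numbers, not a description of how summary computes them.

```matlab
% Retrieve, as variables, the information that summary(A) prints for a vector.
A = categorical({'plane' 'car' 'train' 'car' 'plane'});

names  = categories(A);               % category names, in category order
counts = countcats(A);                % number of elements in each category
numUndefined = nnz(isundefined(A));   % elements that belong to no category
```

Use this approach when you need the counts for further computation rather than for display.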

summary(A,dim) prints the category counts of the categorical array A along dimension dim.

For example, summary(A,2) displays the category counts for each row of a categorical array.

Examples

Create a table.

load patients
BloodPressure = [Systolic Diastolic];
T = table(Gender,Age,Smoker,BloodPressure,'RowNames',LastName);

Add descriptions and units to table T. You can add a description for the table as a whole, and also for individual variables.

T.Properties.Description = 'Simulated patient data';
T.Properties.VariableUnits =  {''  'Yrs' ''  'mm Hg'};
T.Properties.VariableDescriptions{4} = 'Systolic/Diastolic';

Print a summary of table T.

format compact

summary(T)
Description:  Simulated patient data
Variables:
    Gender: 100x1 cell array of character vectors
    Age: 100x1 double
        Properties:
            Units:  Yrs
        Values:
            Min          25   
            Median       39   
            Max          50   
    Smoker: 100x1 logical
        Values:
            True        34   
            False       66   
    BloodPressure: 100x2 double
        Properties:
            Units:  mm Hg
            Description:  Systolic/Diastolic
        Values:
                      Column 1    Column 2
                      ________    ________
            Min         109           68  
            Median      122         81.5  
            Max         138           99  

summary displays the minimum, median, and maximum values for each column of the variable BloodPressure.

Create a small timetable.

Time = seconds(1:5)';
TT = timetable(Time,[98;97.5;97.9;98.1;97.9],[120;111;119;117;116],...
               'VariableNames',{'Reading1','Reading2'})
TT=5×2 timetable
    Time     Reading1    Reading2
    _____    ________    ________

    1 sec        98        120   
    2 sec      97.5        111   
    3 sec      97.9        119   
    4 sec      98.1        117   
    5 sec      97.9        116   

Print a summary of the timetable. summary prints a summary of the row times, followed by a summary of the variables. If the timetable is regular, then summary also prints the size of the time step between row times.

summary(TT)
RowTimes:

    Time: 5x1 duration
        Values:
            Min           1 sec 
            Median        3 sec 
            Max           5 sec 
            TimeStep      1 sec 

Variables:

    Reading1: 5x1 double

        Values:

            Min         97.5  
            Median      97.9  
            Max         98.1  

    Reading2: 5x1 double

        Values:

            Min         111   
            Median      117   
            Max         120   

Create a table. Add units to the table variables. Then display the first few rows.

load patients
BloodPressure = [Systolic Diastolic];
T = table(Gender,Age,Smoker,BloodPressure,'RowNames',LastName);
T.Properties.VariableUnits =  {''  'Years' ''  'mm Hg'};
head(T,3)
                  Gender      Age    Smoker    BloodPressure
                __________    ___    ______    _____________

    Smith       {'Male'  }    38     true       124     93  
    Johnson     {'Male'  }    43     false      109     77  
    Williams    {'Female'}    38     false      125     83  

Return a summary of the table. To return a summary as a structure, specify an output argument when using the summary function.

s = summary(T)
s = struct with fields:
           Gender: [1x1 struct]
              Age: [1x1 struct]
           Smoker: [1x1 struct]
    BloodPressure: [1x1 struct]

Display the summary of the table variable Age. For each variable of T, the output argument s has a field that contains its summary.

s.Age
ans = struct with fields:
           Size: [100 1]
           Type: 'double'
    Description: ''
          Units: 'Years'
     Continuity: []
            Min: 25
         Median: 39
            Max: 50
     NumMissing: 0

The NumMissing field shows the number of missing values in the variable. In this case, Age does not contain any NaN values, so NumMissing is zero. summary includes the NumMissing field for numeric, duration, datetime, and categorical variables.
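To see a nonzero NumMissing value, summarize a variable that contains a NaN. This sketch uses a hypothetical one-variable table (the variable name X is made up for illustration) and compares the field against an independent count from isnan.

```matlab
% Hypothetical table whose only variable, X, contains one NaN.
T = table([1; NaN; 3], 'VariableNames', {'X'});

s = summary(T);
numMissing = s.X.NumMissing;   % number of missing values reported by summary
numNaN = nnz(isnan(T.X));      % independent count of NaN values, for comparison
```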

Display the minimum age contained in the table. You can access any field of the summary by name.

s.Age.Min
ans = 25

Display the summary of the table variable Smoker. You can determine the numbers of smokers and nonsmokers from the True and False fields. The information contained in the summary of a table variable depends on the data type of the variable.

s.Smoker
ans = struct with fields:
           Size: [100 1]
           Type: 'logical'
    Description: ''
          Units: ''
     Continuity: []
           True: 34
          False: 66
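Because every field of s is itself a structure, you can loop over the fields to collect the same piece of information for each variable. The following sketch uses a small hypothetical table rather than the patients data, and it relies only on the Type and Size fields shown in the output above. Field names might differ in other releases.

```matlab
% Hypothetical two-variable table.
T = table([25; 39; 50], [true; false; true], 'VariableNames', {'Age','Smoker'});
s = summary(T);

% Visit each variable summary by field name and report its type and size.
varNames = fieldnames(s);
for k = 1:numel(varNames)
    v = s.(varNames{k});
    fprintf('%s: %s, size %s\n', varNames{k}, v.Type, mat2str(v.Size));
end
```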

Create a timetable.

Time = datetime({'2015-12-18 08:00:00';'2015-12-18 10:00:00';'2015-12-18 12:00:00'});
Temp = [37.3;39.1;42.3];
Pressure = [30.1;30.03;29.9];
TT = timetable(Time,Temp,Pressure)
TT=3×2 timetable
            Time            Temp    Pressure
    ____________________    ____    ________

    18-Dec-2015 08:00:00    37.3      30.1  
    18-Dec-2015 10:00:00    39.1     30.03  
    18-Dec-2015 12:00:00    42.3      29.9  

Return a summary of the timetable as a structure.

s = summary(TT)
s = struct with fields:
        Time: [1x1 struct]
        Temp: [1x1 struct]
    Pressure: [1x1 struct]

Display the summary of the row times. The TimeStep field shows that the time interval between consecutive row times is two hours. The NumMissing field shows there are no missing values (NaT) in the vector of row times.

s.Time
ans = struct with fields:
          Size: [3 1]
          Type: 'datetime'
           Min: 18-Dec-2015 08:00:00
        Median: 18-Dec-2015 10:00:00
           Max: 18-Dec-2015 12:00:00
    NumMissing: 0
      TimeStep: 02:00:00

Change the last row time so that the row times have different intervals between them.

TT.Time(3) = '2015-12-18 11:00:00';
TT
TT=3×2 timetable
            Time            Temp    Pressure
    ____________________    ____    ________

    18-Dec-2015 08:00:00    37.3      30.1  
    18-Dec-2015 10:00:00    39.1     30.03  
    18-Dec-2015 11:00:00    42.3      29.9  

Return a summary of the updated timetable. Because the time steps between row times are no longer equal, the TimeStep field is NaN.

s = summary(TT);
s.Time
ans = struct with fields:
          Size: [3 1]
          Type: 'datetime'
           Min: 18-Dec-2015 08:00:00
        Median: 18-Dec-2015 10:00:00
           Max: 18-Dec-2015 11:00:00
    NumMissing: 0
      TimeStep: NaN

Starting in R2018b, you can add custom properties to tables and timetables. If you add custom properties, then the summary of a table or timetable includes those properties.

First, create a table and add values to some of its predefined properties.

load patients
BloodPressure = [Systolic Diastolic];
T = table(Gender,Age,Smoker,BloodPressure,'RowNames',LastName);
T.Properties.Description = 'Simulated patient data';
T.Properties.VariableUnits =  {''  'Yrs' ''  'mm Hg'};
T.Properties.VariableDescriptions{4} = 'Systolic/Diastolic';

Add custom properties using the addprop function. For each custom property, specify a name. Also, specify whether the value of each custom property stores metadata that applies to the table or to individual table variables.

T = addprop(T,{'SourceFile','DataOrigin'},{'table','variable'});

Store metadata values in the custom properties.

T.Properties.CustomProperties.SourceFile = 'patients.mat';
T.Properties.CustomProperties.DataOrigin = {'census','census','self report','blood pressure reading'};

Print a summary of the table. Aside from T.Properties.Description, the summary function does not display properties that apply to the table as a whole. So, it does not display the value of T.Properties.CustomProperties.SourceFile. However, summary does display properties that apply to table variables. For each variable, summary displays the corresponding value from T.Properties.CustomProperties.DataOrigin.

summary(T)
Description:  Simulated patient data

Variables:

    Gender: 100x1 cell array of character vectors

        Custom Properties:
            DataOrigin:  census
    Age: 100x1 double

        Properties:
            Units:  Yrs
        Custom Properties:
            DataOrigin:  census
        Values:

            Min          25   
            Median       39   
            Max          50   

    Smoker: 100x1 logical

        Custom Properties:
            DataOrigin:  self report
        Values:

            True        34   
            False       66   

    BloodPressure: 100x2 double

        Properties:
            Units:  mm Hg
            Description:  Systolic/Diastolic
        Custom Properties:
            DataOrigin:  blood pressure reading
        Values:
                      Column 1    Column 2
                      ________    ________

            Min         109           68  
            Median      122         81.5  
            Max         138           99  

Return the summary as a structure. Each field is a structure that summarizes one of the table variables.

s = summary(T)
s = struct with fields:
           Gender: [1x1 struct]
              Age: [1x1 struct]
           Smoker: [1x1 struct]
    BloodPressure: [1x1 struct]

The structure s.Age stores the summary for the Age variable.

s.Age
ans = struct with fields:
                Size: [100 1]
                Type: 'double'
         Description: ''
               Units: 'Yrs'
          Continuity: []
                 Min: 25
              Median: 39
                 Max: 50
          NumMissing: 0
    CustomProperties: [1x1 struct]

The s.Age.CustomProperties structure stores the corresponding value from the T.Properties.CustomProperties.DataOrigin property.

s.Age.CustomProperties
ans = struct with fields:
    DataOrigin: {'census'}

Create a 1-by-5 categorical vector.

A = categorical({'plane' 'car' 'train' 'car' 'plane'})
A = 1x5 categorical
     plane      car      train      car      plane 

A has three categories, car, plane, and train.

Print a summary of A.

summary(A)
     car      plane      train 
     2        2          1     

car appears in two elements of A, plane appears in two elements, and train appears in one element.

Since A is a row vector, summary lists the occurrences of each category horizontally.

Create a 4-by-2 categorical array, A, from a numeric array.

X = [1 3; 2 1; 3 1; 4 2];
valueset = 1:3;
catnames = {'red','green','blue'};

A = categorical(X,valueset,catnames)
A = 4x2 categorical
     red              blue  
     green            red   
     blue             red   
     <undefined>      green 

A has three categories, red, green, and blue. The value, 4, was not included in the valueset input to the categorical function. Therefore, the corresponding element, A(4,1), does not have a corresponding category and is undefined.

Print a summary of A.

summary(A)
     red              1      2 
     green            1      1 
     blue             1      1 
     <undefined>      1      0 

red appears once in the first column of A and twice in the second column.

green appears once in the first column of A and once in the second column.

blue appears once in the first column of A and once in the second column.

A contains only one undefined element. It occurs in the first column.

Create a 3-by-2 categorical array, A, from a numeric array.

A = categorical([1 3; 2 1; 3 1],1:3,{'red','green','blue'})
A = 3x2 categorical
     red        blue 
     green      red  
     blue       red  

A has three categories, red, green, and blue.

Print a summary of A along the second dimension.

summary(A,2)
     red      green      blue 
     1        0          1    
     1        1          0    
     1        0          1    

red appears once in the first row of A, once in the second row, and once in the third row.

green appears in only one element. It occurs in the second row of A.

blue appears once in the first row of A and once in the third row.

Input Arguments

T: Input table, specified as a table or a timetable.

A: Categorical array, specified as a vector, matrix, or multidimensional array.

dim: Dimension of A to operate along, specified as a positive integer scalar. If no value is specified, the default is the first array dimension whose size does not equal 1.

Consider a two-dimensional categorical array A:

  • If dim = 1, then summary(A,dim) displays the category counts for each column of A.

  • If dim = 2, then summary(A,dim) displays the category counts for each row of A.

If dim is greater than ndims(A), then summary(A,dim) displays, for each category, an array the same size as A. Each array contains 1 for elements in the corresponding category and 0 otherwise.
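The case where dim exceeds ndims(A) amounts to one indicator array per category. The following sketch builds those arrays directly with the == operator. It illustrates the values described above under that assumption. It does not reproduce the exact display of summary and is not how summary is implemented.

```matlab
% 3-by-2 categorical array, so any dim greater than 2 exceeds ndims(A).
A = categorical([1 3; 2 1; 3 1], 1:3, {'red','green','blue'});

% For each category, build an array the same size as A that contains
% 1 where the element belongs to the category and 0 otherwise.
names = categories(A);
for k = 1:numel(names)
    mask = double(A == names{k});
    fprintf('%s:\n', names{k});
    disp(mask)
end
```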

Output Arguments

s: Summary of the table or timetable variables, returned as a scalar structure. For each variable T.VarName in the input T, the output structure s contains a field s.VarName with the summary for that variable.

If T has variables whose names are not valid MATLAB® identifiers, then summary modifies them to create valid field names, primarily by removing spaces and replacing non-ASCII characters with underscores.

For each data type, s.VarName contains the fields shown below. You can access the fields with dot indexing. For example, s.VarName.Size returns the size of the table variable named VarName.

Fields for Summary of Variable, by Type of Table or Timetable Variable

Numeric, datetime, or duration

  • Size: Size of variable, stored as a numeric array

  • Type: Type of variable, stored as a character vector

  • Description: Description of variable, stored as a character vector

  • Units: Units of variable, stored as a character vector

  • Min: Minimum value

  • Median: Median value

  • Max: Maximum value

  • NumMissing: Number of missing values (NaN or NaT)

  • CustomProperties (omitted if there are no custom properties): Names and values for custom properties associated with variable, stored as a structure

logical

  • Size: Size of variable, stored as a numeric array

  • Type: Type of variable, stored as a character vector

  • Description: Description of variable, stored as a character vector

  • Units: Units of variable, stored as a character vector

  • True: Number of true values

  • False: Number of false values

  • CustomProperties (omitted if there are no custom properties): Names and values for custom properties associated with variable, stored as a structure

categorical

  • Size: Size of variable, stored as a numeric array

  • Type: Type of variable, stored as a character vector

  • Description: Description of variable, stored as a character vector

  • Units: Units of variable, stored as a character vector

  • Categories: Categories, stored as a cell array of character vectors

  • Counts: Number of elements in each category, stored as a numeric array

  • NumMissing: Number of missing values (<undefined>)

  • CustomProperties (omitted if there are no custom properties): Names and values for custom properties associated with variable, stored as a structure

Other

  • Size: Size of variable, stored as a numeric array

  • Type: Type of variable, stored as a character vector

  • Description: Description of variable, stored as a character vector

  • Units: Units of variable, stored as a character vector

  • CustomProperties (omitted if there are no custom properties): Names and values for custom properties associated with variable, stored as a structure
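As an illustration of the type-specific fields, the following sketch summarizes a hypothetical table with one categorical variable (the name Color is made up for this example) and reads the Categories and Counts fields. It assumes the field names listed on this page, which might differ in other releases.

```matlab
% Hypothetical table with a single categorical variable.
Color = categorical({'red'; 'blue'; 'red'; 'green'});
T = table(Color);

s = summary(T);
catNames  = s.Color.Categories;   % category names
catCounts = s.Color.Counts;       % number of elements in each category

% Pair each category with its count for easier reading.
countsTable = table(catNames(:), catCounts(:), 'VariableNames', {'Category','Count'});
```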

If T is a timetable, then s also has a field with a summary of the row times. For timetable row times only, the summary includes the TimeStep field. If the row times increase or decrease monotonically by a fixed time step, then TimeStep contains that time step (for example, 02:00:00 for row times that are two hours apart). If the row times are irregular, then TimeStep is NaN.

Fields for Summary of Timetable Row Times

  • Size: Size of vector of row times, stored as a numeric array

  • Type: Data type, stored as a character vector

  • Min: Minimum value

  • Median: Median value

  • Max: Maximum value

  • NumMissing: Number of missing values (NaT or NaN)

  • TimeStep: Time step between consecutive row times (NaN if irregular)
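Because TimeStep is NaN for irregular row times, you can test it with isnan to check regularity from the summary alone. The following is a minimal sketch with a hypothetical timetable. The variable name Reading is made up, and the row-times field name is read from TT.Properties.DimensionNames rather than hard-coded.

```matlab
% Hypothetical timetable with row times one second apart.
Time = seconds([1; 2; 3; 4]);
TT = timetable(Time, [10; 20; 30; 40], 'VariableNames', {'Reading'});

% The summary field for the row times is named after the row-times dimension.
rowTimesName = TT.Properties.DimensionNames{1};

s = summary(TT);
isRegularBefore = ~isnan(s.(rowTimesName).TimeStep);   % fixed step, so not NaN

% Move the last row time so that the steps are no longer equal.
TT.Time(4) = seconds(10);
s = summary(TT);
isRegularAfter = ~isnan(s.(rowTimesName).TimeStep);    % irregular, so NaN
```

If you need only the regularity check, the isregular function tests a timetable directly without creating a summary.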

More About

Table Summary

The table summary displays the table description from T.Properties.Description followed by information on the variables of T.

The summary contains the following information on the variables:

  • Name: Size and Data Type — Variable name from T.Properties.VariableNames, the size of the variable, and the data type of the variable.

  • Units — Variable units from T.Properties.VariableUnits.

  • Description — Variable description from T.Properties.VariableDescriptions.

  • Custom Properties: — Names of the custom properties that apply to variables, and their corresponding values, from T.Properties.CustomProperties. If there are no custom properties, then this section is omitted.

  • Values — Only included for numeric, logical, categorical, datetime, or duration variables.

    • Numeric, datetime, or duration variables — minimum, median, and maximum values. Also, the number of missing values (NaNs or NaTs) is included when that number is greater than zero.

    • Logical variables — number of values that are true and the number of values that are false.

    • Categorical variables — number of elements from each category. Also, the number of undefined elements is included when that number is greater than zero.

If T is a timetable, then the summary contains the same information on the vector of row times.

Version History

Introduced in R2013b