Tables of Mixed Data
Store Related Data in Single Container
You can use the table data type to collect mixed-type data and metadata properties, such as variable names, row names, descriptions, and variable units, in a single container. Tables are suitable for column-oriented or tabular data that is often stored as columns in a text file or in a spreadsheet. For example, you can use a table to store experimental data, with rows representing different observations and columns representing different measured variables.
Tables consist of rows and column-oriented variables. Variables in a table can have different data types and different sizes, but the variables must have the same number of rows. Also, the data within a variable is homogeneous, which enables you to treat a table variable like an array of data.
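The row-count rule can be checked with a small sketch. The variable names and values below are hypothetical and are not part of the patients data set.

```matlab
% Hypothetical sample data: three observations, three variables.
Name = ["Ana"; "Ben"; "Cy"];          % 3x1 string
Height = [165; 180; 172];             % 3x1 double
Position = [1 2; 3 4; 5 6];           % 3x2 double, a multicolumn variable

% Works: the types and widths differ, but every variable has 3 rows.
T1 = table(Name, Height, Position);

% Each variable keeps its own type and width.
nameClass = class(T1.Name);
positionSize = size(T1.Position);

% A variable with a different number of rows is rejected.
try
    T2 = table(Name, [1; 2]);
catch err
    disp(err.message)
end
```

Because each variable is homogeneous, an expression such as mean(T1.Height) operates on an ordinary numeric array.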
For example, load sample data about patients from the patients.mat MAT-file. Combine blood pressure data into a single variable. Convert a four-category variable called SelfAssessedHealthStatus, which has values of Poor, Fair, Good, or Excellent, to a categorical array. View information about several of the variables.
load patients
BloodPressure = [Systolic Diastolic];
SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus);
whos("Age","Smoker","BloodPressure","SelfAssessedHealthStatus")
  Name                          Size     Bytes  Class          Attributes

  Age                           100x1      800  double
  BloodPressure                 100x2     1600  double
  SelfAssessedHealthStatus      100x1      624  categorical
  Smoker                        100x1      100  logical
Now, create a table from these variables and display it. The variables can be stored together in a table because they all have the same number of rows, 100.
T = table(Age,Smoker,BloodPressure,SelfAssessedHealthStatus)
T=100×4 table
Age Smoker BloodPressure SelfAssessedHealthStatus
___ ______ _____________ ________________________
38 true 124 93 Excellent
43 false 109 77 Fair
38 false 125 83 Good
40 false 117 75 Fair
49 false 122 80 Good
46 false 121 70 Good
33 true 130 88 Good
40 false 115 82 Good
28 false 115 78 Excellent
31 false 118 86 Excellent
45 false 114 77 Excellent
42 false 115 68 Poor
25 false 127 74 Poor
39 true 130 95 Excellent
36 false 114 79 Good
48 true 130 92 Good
⋮
Each variable in a table has one data type. If you add a new row to the table, MATLAB® enforces consistency of the data type between the new data and the corresponding table variables. For example, if you try to add information for a new patient where the first column contains the patient's health status instead of age, as in the expression T(end+1,:) = {"Poor",true,[130 84],37}, then you receive the error:
Right hand side of an assignment to a categorical array must be a categorical or text representing a category name.
The error occurs because MATLAB® cannot assign numeric data, 37, to the categorical array, SelfAssessedHealthStatus.
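For contrast, the following sketch adds a row whose values are listed in the same order as the table variables. It uses a small stand-in table, P, with hypothetical values rather than the full patients data.

```matlab
% Stand-in table with the same variable types as T (hypothetical values).
Age = [38; 43];
Smoker = [true; false];
BloodPressure = [124 93; 109 77];
SelfAssessedHealthStatus = categorical(["Excellent"; "Fair"]);
P = table(Age, Smoker, BloodPressure, SelfAssessedHealthStatus);

% Values in variable order: double, logical, 1x2 double, category name.
P(end+1,:) = {37, true, [130 84], "Fair"};
```

The text "Fair" is accepted for the last variable because it names a category of the categorical array.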
Access Data Using Numeric or Named Indexing
You can index into a table using parentheses, curly braces, or dot notation. Parentheses allow you to select a subset of the data in a table and preserve the table container. Curly braces and dot notation allow you to extract data from a table. Within each table indexing method, you can specify the rows or variables to access by name or by numeric index.
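As a minimal sketch of the three indexing methods, the following code uses a small table, D, with hypothetical values and row names. The comments state the type that each expression returns.

```matlab
% Hypothetical three-row table with row names.
Age = [38; 43; 40];
Smoker = [true; false; false];
D = table(Age, Smoker, 'RowNames', {'Smith'; 'Johnson'; 'Jones'});

sub  = D(1:2, "Age");       % parentheses: a 2-by-1 table
vals = D{1:2, "Age"};       % curly braces: a 2-by-1 double array
col  = D.Age;               % dot notation: the whole variable as an array
one  = D{"Jones", "Age"};   % row name and variable name: a scalar double
```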
Consider the sample table from above. Each row in the table, T, represents a different patient. The workspace variable, LastName, contains unique identifiers for the 100 rows. Add row names to the table by setting the RowNames property to LastName and display the first five rows of the updated table.
T.Properties.RowNames = LastName;
T(1:5,:)
ans=5×4 table
Age Smoker BloodPressure SelfAssessedHealthStatus
___ ______ _____________ ________________________
Smith 38 true 124 93 Excellent
Johnson 43 false 109 77 Fair
Williams 38 false 125 83 Good
Jones 40 false 117 75 Fair
Brown 49 false 122 80 Good
In addition to labeling the data, you can use row and variable names to access data in the table. For example, use named indexing to display the age and blood pressure of the patients Williams and Brown.
T(["Williams","Brown"],["Age","BloodPressure"])
ans=2×2 table
Age BloodPressure
___ _____________
Williams 38 125 83
Brown 49 122 80
Now, use numeric indexing to return an equivalent subtable. Return the third and fifth rows from the first and third variables.
T([3 5],[1 3])
ans=2×2 table
Age BloodPressure
___ _____________
Williams 38 125 83
Brown 49 122 80
For more information on table indexing, see Access Data in Tables.
Describe Data with Table Properties
In addition to storing data, tables have properties to store metadata, such as variable names, row names, descriptions, and variable units. You can access a property using T.Properties.PropName, where T is the name of the table and PropName is the name of a table property.
For example, add a table description, variable descriptions, and variable units for Age.
T.Properties.Description = "Simulated Patient Data";
T.Properties.VariableDescriptions = ...
    ["" ...
     "true or false" ...
     "Systolic/Diastolic" ...
     "Status Reported by Patient"];
T.Properties.VariableUnits("Age") = "Yrs";
Individual empty strings within VariableDescriptions indicate that the corresponding variable does not have a description. For more information, see the Properties section of table.
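Properties can be read back with the same T.Properties.PropName syntax. The sketch below uses a small table, Q, with hypothetical values and assumes that text-valued properties such as VariableNames and VariableUnits are returned as cell arrays of character vectors.

```matlab
% Hypothetical two-variable table.
Age = [38; 43];
Smoker = [true; false];
Q = table(Age, Smoker);

% Set metadata: units for one variable, descriptions for both.
Q.Properties.VariableUnits("Age") = "Yrs";
Q.Properties.VariableDescriptions = ["" "true or false"];

% Read metadata back.
names = Q.Properties.VariableNames;
units = Q.Properties.VariableUnits;
ageUnit = units{1};              % 'Yrs'
hasSmokerUnit = ~isempty(units{2});   % false: no unit was set for Smoker
```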
To print a table summary, use the summary function.
summary(T)
T: 100x4 table

Description: Simulated Patient Data

Variables:

    Age: double (Yrs)
    Smoker: logical (34 true, true or false)
    BloodPressure: 2-column double (Systolic/Diastolic)
    SelfAssessedHealthStatus: categorical (4 categories, Status Reported by Patient)

Statistics for applicable variables:

                                NumMissing    Min    Median     Max      Mean       Std
    Age                             0          25      39        50     38.2800    7.2154
    BloodPressure(:,1)              0         109     122       138    122.7800    6.7128
    BloodPressure(:,2)              0          68      81.5000   99     82.9600    6.9325
    SelfAssessedHealthStatus        0
Comparison to Cell Arrays
Like a table, a cell array can provide storage for mixed-type data in a single container. But unlike a table, a cell array does not provide metadata that describes its contents, and it does not force data in its columns to remain homogeneous. You also cannot access the contents of a cell array using row names or column names.
For example, convert T to a cell array using the table2cell function. The output cell array contains the same data but has no information about that data. If it is important to keep such information attached to your data, then storing it in a table is a better choice than storing it in a cell array.
C = table2cell(T)
C=100×4 cell array
{[38]} {[1]} {[124 93]} {[Excellent]}
{[43]} {[0]} {[109 77]} {[Fair ]}
{[38]} {[0]} {[125 83]} {[Good ]}
{[40]} {[0]} {[117 75]} {[Fair ]}
{[49]} {[0]} {[122 80]} {[Good ]}
{[46]} {[0]} {[121 70]} {[Good ]}
{[33]} {[1]} {[130 88]} {[Good ]}
{[40]} {[0]} {[115 82]} {[Good ]}
{[28]} {[0]} {[115 78]} {[Excellent]}
{[31]} {[0]} {[118 86]} {[Excellent]}
{[45]} {[0]} {[114 77]} {[Excellent]}
{[42]} {[0]} {[115 68]} {[Poor ]}
{[25]} {[0]} {[127 74]} {[Poor ]}
{[39]} {[1]} {[130 95]} {[Excellent]}
{[36]} {[0]} {[114 79]} {[Good ]}
{[48]} {[1]} {[130 92]} {[Good ]}
⋮
To access subsets of data in a cell array, you can only use indexing with parentheses or curly braces.
C(1:5,1:3)
ans=5×3 cell array
{[38]} {[1]} {[124 93]}
{[43]} {[0]} {[109 77]}
{[38]} {[0]} {[125 83]}
{[40]} {[0]} {[117 75]}
{[49]} {[0]} {[122 80]}
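The following sketch illustrates the two indexing methods and the lack of type enforcement. It uses a small cell array, C2, with hypothetical values.

```matlab
% Hypothetical 2-by-2 cell array: ages in column 1, smoker flags in column 2.
C2 = {38, true; 43, false};

% Allowed: column 1 now mixes a double and a string.
C2{2,1} = "forty-three";

sub = C2(:,1);    % parentheses: a 2-by-1 cell array
val = C2{1,1};    % curly braces: the contents of one cell, a double

% Check which cells in column 1 still hold numeric data.
isNum = cellfun(@isnumeric, C2(:,1));
```

An analogous assignment into a numeric table variable would not leave the variable with mixed types, because each table variable has one data type.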
Comparison to Structures
Structures can also provide storage for mixed-type data. A structure has fields that you can access by name, just as you can access table variables by name. However, a structure does not force data in its fields to remain homogeneous, and it does not provide any metadata to describe its contents.
For example, convert T to a scalar structure where every field is an array, in a way that resembles table variables. Use the table2struct function with the ToScalar name-value argument.
S = table2struct(T,ToScalar=true)
S = struct with fields:
Age: [100x1 double]
Smoker: [100x1 logical]
BloodPressure: [100x2 double]
SelfAssessedHealthStatus: [100x1 categorical]
In this structure, you can access arrays of data by using field names.
S.Age
ans = 100×1
38
43
38
40
49
46
33
40
28
31
⋮
But to access subsets of data in the fields, you can only use numeric indices, and you can only access one field at a time. Table row and variable indexing provides more flexible access to data in a table.
S.Age(1:5)
ans = 5×1
38
43
38
40
49
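To make the difference concrete, the following sketch selects the same rows from a scalar structure and from a table. The table T3 and its values are hypothetical.

```matlab
% Hypothetical three-row table and the equivalent scalar structure.
Age = [38; 43; 40];
Smoker = [true; false; false];
T3 = table(Age, Smoker);
S3 = table2struct(T3, 'ToScalar', true);

% Structure: compute an index, then apply it to each field separately.
idx = S3.Age > 39;
agesS = S3.Age(idx);
smokerS = S3.Smoker(idx);

% Table: one indexing expression selects the rows across all variables.
R = T3(T3.Age > 39, :);
```

With the structure, nothing keeps the fields aligned if you index one field and forget another. The table returns the matching rows of every variable together.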
See Also
table | summary | table2cell | table2struct | readtable