fitlme

Fit linear mixed-effects model

Syntax

lme = fitlme(tbl,formula)

lme = fitlme(tbl,formula,Name,Value)

Description

lme = fitlme(tbl,formula) returns a linear mixed-effects model, specified by formula, fitted to the variables in the table or dataset array tbl.

example

lme = fitlme(tbl,formula,Name,Value) returns a linear mixed-effects model with additional options specified by one or more Name,Value pair arguments.

For example, you can specify the covariance pattern of the random-effects terms, the method to use in estimating the parameters, or options for the optimization algorithm.

example

Examples

collapse all

Fit Linear Mixed-Effects Model

Open Live Script

Load the sample data.

load imports-85

Store the variables in a table.

tbl = table(X(:,12),X(:,14),X(:,24),'VariableNames',{'Horsepower','CityMPG','EngineType'});

Display the first five rows of the table.

tbl(1:5,:)

ans=5×3 table
    Horsepower    CityMPG    EngineType
    __________    _______    __________

       111          21           13    
       111          21           13    
       154          19           37    
       102          24           35    
       115          18           35

Fit a linear mixed-effects model for miles per gallon in the city, with fixed effects for horsepower, and uncorrelated random effect for intercept and horsepower grouped by the engine type.

lme = fitlme(tbl,'CityMPG~Horsepower+(1|EngineType)+(Horsepower-1|EngineType)');

In this model, CityMPG is the response variable, horsepower is the predictor variable, and engine type is the grouping variable. The fixed-effects portion of the model corresponds to 1 + Horsepower, because the intercept is included by default.

Since the random-effect terms for intercept and horsepower are uncorrelated, these terms are specified separately. Because the second random-effect term is only for horsepower, you must include a –1 to eliminate the intercept from the second random-effect term.

Display the model.

lme

lme = 
Linear mixed-effects model fit by ML

Model information:
    Number of observations             203
    Fixed effects coefficients           2
    Random effects coefficients         14
    Covariance parameters                3

Formula:
    CityMPG ~ 1 + Horsepower + (1 | EngineType) + (Horsepower | EngineType)

Model fit statistics:
    AIC       BIC     LogLikelihood    Deviance
    1099.5    1116    -544.73          1089.5  

Fixed effects coefficients (95% CIs):
    Name                   Estimate    SE         tStat     DF     pValue        Lower       Upper    
    {'(Intercept)'}          37.276     2.8556    13.054    201    1.3147e-28      31.645       42.906
    {'Horsepower' }        -0.12631    0.02284     -5.53    201    9.8848e-08    -0.17134    -0.081269

Random effects covariance parameters (95% CIs):
Group: EngineType (7 Levels)
    Name1                  Name2                  Type           Estimate    Lower     Upper 
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        5.7338      2.3773    13.829

Group: EngineType (7 Levels)
    Name1                 Name2                 Type           Estimate    Lower      Upper  
    {'Horsepower'}        {'Horsepower'}        {'std'}        0.050357    0.02307    0.10992

Group: Error
    Name               Estimate    Lower     Upper 
    {'Res Std'}        3.226       2.9078    3.5789

Note that the random-effects covariance parameters for intercept and horsepower are separate in the display.

Now, fit a linear mixed-effects model for miles per gallon in the city, with the same fixed-effects term and potentially correlated random effect for intercept and horsepower grouped by the engine type.

lme2 = fitlme(tbl,'CityMPG~Horsepower+(Horsepower|EngineType)');

Because the random-effect term includes the intercept by default, you do not have to add 1, the random effect term is equivalent to (1 + Horsepower|EngineType).

Display the model.

lme2

lme2 = 
Linear mixed-effects model fit by ML

Model information:
    Number of observations             203
    Fixed effects coefficients           2
    Random effects coefficients         14
    Covariance parameters                4

Formula:
    CityMPG ~ 1 + Horsepower + (1 + Horsepower | EngineType)

Model fit statistics:
    AIC     BIC       LogLikelihood    Deviance
    1089    1108.9    -538.52          1077    

Fixed effects coefficients (95% CIs):
    Name                   Estimate    SE          tStat      DF     pValue        Lower      Upper    
    {'(Intercept)'}         33.824       4.0181     8.4178    201    7.1678e-15     25.901       41.747
    {'Horsepower' }        -0.1087     0.032912    -3.3029    201     0.0011328    -0.1736    -0.043806

Random effects covariance parameters (95% CIs):
Group: EngineType (7 Levels)
    Name1                  Name2                  Type            Estimate    Lower       Upper   
    {'(Intercept)'}        {'(Intercept)'}        {'std' }          9.4952      4.7022      19.174
    {'Horsepower' }        {'(Intercept)'}        {'corr'}        -0.96843    -0.99568    -0.78738
    {'Horsepower' }        {'Horsepower' }        {'std' }        0.078874    0.039917     0.15585

Group: Error
    Name               Estimate    Lower     Upper 
    {'Res Std'}        3.1845      2.8774    3.5243

Note that the random effects covariance parameters for intercept and horsepower are together in the display, and it includes the correlation ('corr') between the intercept and horsepower.

Fit Random Intercept LME Model

Open Live Script

Load the sample data and convert it to table format.

load flu
flu = dataset2table(flu);

The flu table has a Date variable, and 10 variables containing estimated influenza rates (in 9 different regions, estimated from Google® searches, plus a nationwide estimate from the Centers for Disease Control and Prevention, CDC).

To fit a linear-mixed effects model, your data must be in a properly formatted table. To fit a linear mixed-effects model with the influenza rates as the responses, combine the nine columns corresponding to the regions into a table variable. The new table, flu2, must have the new response variable FluRate, the nominal variable Region that shows which region each estimate is from, the nationwide estimate WtdILI, and the grouping variable Date.

flu2 = stack(flu,2:10,NewDataVariableName="FluRate", ...
    IndexVariableName="Region");
flu2.Date = categorical(flu2.Date);

Display the first six rows of flu2.

flu2(1:6,:)

ans=6×4 table
      Date       WtdILI     Region      FluRate
    _________    ______    _________    _______

    10/9/2005    1.182     NE             0.97 
    10/9/2005    1.182     MidAtl        1.025 
    10/9/2005    1.182     ENCentral     1.232 
    10/9/2005    1.182     WNCentral     1.286 
    10/9/2005    1.182     SAtl          1.082 
    10/9/2005    1.182     ESCentral     1.457

Fit a linear mixed-effects model with a fixed-effects term for the nationwide estimate, WtdILI, and a random intercept that varies by Date. The model corresponds to

$y_{i m} = β_{0} + β_{1} {W t d I L I}_{i m} + b_{0 m} + ε_{i m}, i = 1, 2, . . ., 468, m = 1, 2, . . ., 52,$

where $y_{i m}$ is the observation $i$ for level $m$ of grouping variable Date, $b_{0 m}$ is the random effect for level $m$ of the grouping variable Date, and $ε_{i m}$ is the observation error for observation $i$ . The random effect has the prior distribution,

$b_{0 m} \sim N (0, σ_{b}^{2}),$

and the error term has the distribution,

$ε_{i m} \sim N (0, σ^{2}) .$

lme = fitlme(flu2,"FluRate ~ 1 + WtdILI + (1|Date)")

lme = 
Linear mixed-effects model fit by ML

Model information:
    Number of observations             468
    Fixed effects coefficients           2
    Random effects coefficients         52
    Covariance parameters                2

Formula:
    FluRate ~ 1 + WtdILI + (1 | Date)

Model fit statistics:
    AIC       BIC       LogLikelihood    Deviance
    286.24    302.83    -139.12          278.24  

Fixed effects coefficients (95% CIs):
    Name                   Estimate    SE          tStat     DF     pValue        Lower       Upper  
    {'(Intercept)'}        0.16385     0.057525    2.8484    466     0.0045885    0.050813    0.27689
    {'WtdILI'     }         0.7236     0.032219    22.459    466    3.0502e-76     0.66028    0.78691

Random effects covariance parameters (95% CIs):
Group: Date (52 Levels)
    Name1                  Name2                  Type           Estimate    Lower      Upper  
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        0.17146     0.13227    0.22226

Group: Error
    Name               Estimate    Lower      Upper  
    {'Res Std'}        0.30201     0.28217    0.32324

Estimated covariance parameters are displayed in the section titled "Random effects covariance parameters". The estimated value of $σ_{b}$ is 0.17146 and its 95% confidence interval is [0.13227, 0.22226]. Since this interval does not include 0, the random-effects term is significant. You can formally test the significance of any random-effects term using a likelihood ratio test via the compare method.

The estimated response at an observation is the sum of the fixed effects and the random-effect value at the grouping variable level corresponding to that observation. For example, the estimated flu rate for observation 28 is

$\begin{array}{l} {y_{}^{ˆ}}_{28} = {β_{}^{ˆ}}_{0} + {β_{}^{ˆ}}_{1} {W t d I L I}_{28} + {b_{}^{ˆ}}_{10 / 30 / 2005} \\ = 0.1639 + 0.7236 * (1.343) + 0.3318 \\ = 1.46749, \end{array}$

where $b_{}^{ˆ}$ is the estimated best linear unbiased predictor (BLUP) of the random effects for the intercept. You can compute this value as follows.

beta = fixedEffects(lme);
[~,~,STATS] = randomEffects(lme); % Compute the random-effects statistics (STATS)
STATS.Level = nominal(STATS.Level);
y_hat = beta(1) + beta(2)*flu2.WtdILI(28) + STATS.Estimate(STATS.Level=='10/30/2005')

y_hat = 
1.4674

You can display the fitted value using the fitted method.

F = fitted(lme);
F(28)

ans = 
1.4674

LME Model for Randomized Block Design

Open Live Script

Load the sample data.

load('shift.mat')

The data shows the absolute deviations from the target quality characteristic measured from the products each of five operators manufacture during three shifts: morning, evening, and night. This is a randomized block design, where the operators are the blocks. The experiment is designed to study the impact of the time of shift on the performance. The performance measure is the absolute deviations of the quality characteristics from the target value. This is simulated data.

Fit a linear mixed-effects model with a random intercept grouped by operator to assess if performance significantly differs according to the time of the shift. Use the restricted maximum likelihood method and 'effects' contrasts.

'effects' contrasts mean that the coefficients sum to 0, and fitlme creates a matrix called a fixed effects design matrix to describe the effect of shift. This matrix has two columns, $S h i f t_E v e n i n g$ and $S h i f t_M o r n i n g$ , where

$Shift_Evening = {\begin{cases} 0, & if Morning \\ 1, & if Evening \\ - 1, & if Night \end{cases}$

$Shift_Morning = {\begin{cases} 1, & if Morning \\ 0, & if Evening \\ - 1, & if Night \end{cases}$

The model corresponds to

$\begin{array}{lrl} Morning Shift: & {QCDev}_{i m} & = β_{0} + β_{2} {Shift_Morning}_{i} + b_{0 m} + ϵ_{i m}, \\ Evening Shift: & {QCDev}_{i m} & = β_{0} + β_{1} {Shift_Evening}_{i} + b_{0 m} + ϵ_{i m}, \\ Night Shift: & {QCDev}_{i m} & = β_{0} - β_{1} {Shift_Evening}_{i} - β_{2} {Shift_Morning}_{i} + b_{0 m} + ϵ_{i m}, \end{array}$

where $i$ represents the observations, and $m$ represents the operators, $i$ = 1, 2, ..., 15, and $m$ = 1, 2, ..., 5. The random effects and the observation error have the following distributions:

$b_{0 m} \sim N (0, σ_{b}^{2})$

and

$ε_{i m} \sim N (0, σ^{2}) .$

lme = fitlme(shift,'QCDev ~ Shift + (1|Operator)',...
'FitMethod','REML','DummyVarCoding','effects')

lme = 
Linear mixed-effects model fit by REML

Model information:
    Number of observations              15
    Fixed effects coefficients           3
    Random effects coefficients          5
    Covariance parameters                2

Formula:
    QCDev ~ 1 + Shift + (1 | Operator)

Model fit statistics:
    AIC       BIC       LogLikelihood    Deviance
    58.913    61.337    -24.456          48.913  

Fixed effects coefficients (95% CIs):
    Name                     Estimate    SE         tStat      DF    pValue       Lower      Upper   
    {'(Intercept)'  }          3.6525    0.94109     3.8812    12    0.0021832     1.6021       5.703
    {'Shift_Evening'}        -0.53293    0.31206    -1.7078    12      0.11339    -1.2129     0.14699
    {'Shift_Morning'}        -0.91973    0.31206    -2.9473    12     0.012206    -1.5997    -0.23981

Random effects covariance parameters (95% CIs):
Group: Operator (5 Levels)
    Name1                  Name2                  Type           Estimate    Lower      Upper 
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        2.0457      0.98207    4.2612

Group: Error
    Name               Estimate    Lower      Upper
    {'Res Std'}        0.85462     0.52357    1.395

Compute the best linear unbiased predictor (BLUP) estimates of random effects.

B = randomEffects(lme)

The estimated absolute deviation from the target quality characteristics for the third operator working the evening shift is

$\begin{array}{l} {y_{}^{ˆ}}_{Evening, Operator 3} = {β_{}^{ˆ}}_{0} + {β_{}^{ˆ}}_{1} Shift_Evening + {b_{}^{ˆ}}_{03} \\ = 3.6525 - 0.53293 - 2.1715 \\ = 0.94807 . \end{array}$

You can also display this value as follows.

F = fitted(lme);
F(shift.Shift=='Evening' & shift.Operator=='3')

ans = 
0.9481

Similarly, you can calculate the estimated absolute deviation from the target quality characteristics for the third operator working the morning shift as

$\begin{array}{l} {y_{}^{ˆ}}_{Morning, Operator 3} = {β_{}^{ˆ}}_{0} + {β_{}^{ˆ}}_{2} Shift_Morning + {b_{}^{ˆ}}_{03} \\ = 3.6525 - 0.91973 - 2.1715 \\ = 0.56127 . \end{array}$

You can also display this value as follows.

F(shift.Shift=='Morning' & shift.Operator=='3')

ans = 
0.5613

The operator tends to make a smaller magnitude of error during the morning shift.

LME Model for Split-Plot Experiment

Open Live Script

Load and display the sample data.

load fertilizer.mat
tbl

tbl=60×4 table
      Soil          Tomato       Fertilizer    Yield
    _________    ____________    __________    _____

    {'Sandy'}    {'Plum'    }        1          104 
    {'Sandy'}    {'Plum'    }        2          136 
    {'Sandy'}    {'Plum'    }        3          158 
    {'Sandy'}    {'Plum'    }        4          174 
    {'Sandy'}    {'Cherry'  }        1           57 
    {'Sandy'}    {'Cherry'  }        2           86 
    {'Sandy'}    {'Cherry'  }        3           89 
    {'Sandy'}    {'Cherry'  }        4           98 
    {'Sandy'}    {'Heirloom'}        1           65 
    {'Sandy'}    {'Heirloom'}        2           62 
    {'Sandy'}    {'Heirloom'}        3          113 
    {'Sandy'}    {'Heirloom'}        4           84 
    {'Sandy'}    {'Grape'   }        1           54 
    {'Sandy'}    {'Grape'   }        2           86 
    {'Sandy'}    {'Grape'   }        3           89 
    {'Sandy'}    {'Grape'   }        4          115 
      ⋮

The table tbl includes data from a split-plot experiment, where soil is divided into three blocks based on the soil type: sandy, silty, and loamy. Each block is divided into five plots, where five types of tomato plants (cherry, heirloom, grape, vine, and plum) are randomly assigned to these plots. The tomato plants in the plots are then divided into subplots, where each subplot is treated by one of four fertilizers. This is simulated data.

Convert the variables Tomato, Soil, and Fertilizer to categorical variables.

tbl.Tomato = categorical(tbl.Tomato);
tbl.Soil = categorical(tbl.Soil);
tbl.Fertilizer = categorical(tbl.Fertilizer);

Fit a linear mixed-effects model, where Fertilizer and Tomato are the fixed-effects variables, and the mean yield varies by the block (soil type) and the plots within blocks (tomato types within soil types) independently.

This model corresponds to

$\begin{array}{l} y_{i m j k} = β_{0} + \sum_{m = 2}^{4} β_{1 m} I {[F]}_{i m} + \sum_{j = 2}^{5} β_{2 j} I {[T]}_{i j} + \sum_{j = 2}^{5} \sum_{m = 2}^{4} β_{3 m j} I {[F]}_{i m} I {[T]}_{i j} \\ + b_{0 k} S_{k} + b_{0 j k} (S * T)_{j k} + ε_{i m j k}, \end{array}$

where $i$ = 1, 2, ..., 60, index $m$ corresponds to the fertilizer types, $j$ corresponds to the tomato types, and $k$ = 1, 2, 3 corresponds to the blocks (soil). $S_{k}$ represents the $k$ th soil type, and $(S * T)_{j k}$ represents the $j$ th tomato type nested in the $k$ th soil type. $I [F]_{i m}$ is the dummy variable representing level $m$ of the fertilizer. Similarly, $I [T]_{i j}$ is the dummy variable representing level $j$ of the tomato type.

The random effects and observation error have these prior distributions: $b_{0 k}$ ~N(0, $σ_{S}^{2}$ ), $b_{0 j k}$ ~N(0, $σ_{S * T}^{2}$ ), and $ϵ_{i m j k}$ ~ N(0, $σ^{2}$ ).

lme = fitlme(tbl,"Yield ~ Fertilizer * Tomato + (1|Soil) + (1|Soil:Tomato)")

lme = 
Linear mixed-effects model fit by ML

Model information:
    Number of observations              60
    Fixed effects coefficients          20
    Random effects coefficients         18
    Covariance parameters                3

Formula:
    Yield ~ 1 + Tomato*Fertilizer + (1 | Soil) + (1 | Soil:Tomato)

Model fit statistics:
    AIC       BIC       LogLikelihood    Deviance
    522.57    570.74    -238.29          476.57  

Fixed effects coefficients (95% CIs):
    Name                                    Estimate    SE        tStat       DF    pValue        Lower      Upper 
    {'(Intercept)'                 }             77     8.5836      8.9706    40    4.0206e-11     59.652    94.348
    {'Tomato_Grape'                }            -16     11.966     -1.3371    40       0.18873    -40.184    8.1837
    {'Tomato_Heirloom'             }        -6.6667     11.966    -0.55714    40       0.58053     -30.85    17.517
    {'Tomato_Plum'                 }         32.333     11.966      2.7022    40      0.010059     8.1496    56.517
    {'Tomato_Vine'                 }            -13     11.966     -1.0864    40       0.28379    -37.184    11.184
    {'Fertilizer_2'                }         34.667      8.572      4.0442    40    0.00023272     17.342    51.991
    {'Fertilizer_3'                }         33.667      8.572      3.9275    40    0.00033057     16.342    50.991
    {'Fertilizer_4'                }         47.667      8.572      5.5607    40    1.9567e-06     30.342    64.991
    {'Tomato_Grape:Fertilizer_2'   }        -2.6667     12.123    -0.21997    40       0.82701    -27.167    21.834
    {'Tomato_Heirloom:Fertilizer_2'}             -8     12.123    -0.65992    40       0.51309    -32.501    16.501
    {'Tomato_Plum:Fertilizer_2'    }            -15     12.123     -1.2374    40       0.22317    -39.501    9.5007
    {'Tomato_Vine:Fertilizer_2'    }            -16     12.123     -1.3198    40       0.19439    -40.501    8.5007
    {'Tomato_Grape:Fertilizer_3'   }         16.667     12.123      1.3748    40       0.17683    -7.8341    41.167
    {'Tomato_Heirloom:Fertilizer_3'}         3.3333     12.123     0.27497    40       0.78476    -21.167    27.834
    {'Tomato_Plum:Fertilizer_3'    }         3.6667     12.123     0.30246    40       0.76387    -20.834    28.167
    {'Tomato_Vine:Fertilizer_3'    }              3     12.123     0.24747    40       0.80581    -21.501    27.501
    {'Tomato_Grape:Fertilizer_4'   }         13.333     12.123      1.0999    40       0.27796    -11.167    37.834
    {'Tomato_Heirloom:Fertilizer_4'}            -19     12.123     -1.5673    40       0.12492    -43.501    5.5007
    {'Tomato_Plum:Fertilizer_4'    }        -2.6667     12.123    -0.21997    40       0.82701    -27.167    21.834
    {'Tomato_Vine:Fertilizer_4'    }         8.6667     12.123     0.71492    40       0.47881    -15.834    33.167

Random effects covariance parameters (95% CIs):
Group: Soil (3 Levels)
    Name1                  Name2                  Type           Estimate    Lower       Upper 
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        2.5028      0.027711    226.04

Group: Soil:Tomato (15 Levels)
    Name1                  Name2                  Type           Estimate    Lower     Upper 
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        10.225      6.1497    17.001

Group: Error
    Name               Estimate    Lower     Upper 
    {'Res Std'}        10.499      8.5389    12.908

The $p$ -values corresponding to the last 12 rows in the fixed-effects coefficients display (0.82701 to 0.47881) indicate that interaction coefficients between the tomato and fertilizer types are not significant. To test for the overall interaction between tomato and fertilizer, use the anova method after refitting the model using effects contrasts.

The confidence interval for the standard deviations of the random-effects terms ( $σ_{S}^{2}$ ), where the intercept is grouped by soil, is very large. This term does not appear significant.

Refit the model after removing the interaction term Tomato:Fertilizer and the random-effects term (1 | Soil).

lme = fitlme(tbl,"Yield ~ Fertilizer + Tomato + (1|Soil:Tomato)")

lme = 
Linear mixed-effects model fit by ML

Model information:
    Number of observations              60
    Fixed effects coefficients           8
    Random effects coefficients         15
    Covariance parameters                2

Formula:
    Yield ~ 1 + Tomato + Fertilizer + (1 | Soil:Tomato)

Model fit statistics:
    AIC       BIC    LogLikelihood    Deviance
    511.06    532    -245.53          491.06  

Fixed effects coefficients (95% CIs):
    Name                       Estimate    SE        tStat       DF    pValue        Lower      Upper 
    {'(Intercept)'    }         77.733     7.3293      10.606    52    1.3108e-14     63.026    92.441
    {'Tomato_Grape'   }        -9.1667     9.6045    -0.95441    52       0.34429    -28.439    10.106
    {'Tomato_Heirloom'}        -12.583     9.6045     -1.3102    52        0.1959    -31.856    6.6895
    {'Tomato_Plum'    }         28.833     9.6045      3.0021    52     0.0041138     9.5605    48.106
    {'Tomato_Vine'    }        -14.083     9.6045     -1.4663    52       0.14858    -33.356    5.1895
    {'Fertilizer_2'   }         26.333     4.5004      5.8514    52    3.3024e-07     17.303    35.364
    {'Fertilizer_3'   }             39     4.5004      8.6659    52    1.1459e-11     29.969    48.031
    {'Fertilizer_4'   }         47.733     4.5004      10.607    52     1.308e-14     38.703    56.764

Random effects covariance parameters (95% CIs):
Group: Soil:Tomato (15 Levels)
    Name1                  Name2                  Type           Estimate    Lower     Upper 
    {'(Intercept)'}        {'(Intercept)'}        {'std'}        10.02       6.0812    16.509

Group: Error
    Name               Estimate    Lower     Upper 
    {'Res Std'}        12.325      10.024    15.153

You can compare the two models using the compare method with the simulated likelihood ratio test since both a fixed-effect and a random-effect term are tested.

Longitudinal Study with a Covariate

Open Live Script

Load the sample data.

load('weight.mat')

weight contains data from a longitudinal study, where 20 subjects are randomly assigned to 4 exercise programs (A, B, C, D), and their weight loss is recorded over six 2-week time periods. This is simulated data.

Store the data in a table. Define Subject and Program as categorical variables.

tbl = table(InitialWeight,Program,Subject,Week,y);
tbl.Subject = nominal(tbl.Subject);
tbl.Program = nominal(tbl.Program);

Fit a linear mixed-effects model where the initial weight, type of program, week, and the interaction between the week and type of program are the fixed effects. The intercept and week vary by subject.

fitlme uses program A as a reference and creates the necessary dummy variables $I$ [.]. Since the model already has an intercept, fitlme only creates dummy variables for programs B, C, and D. This is also known as the 'reference' method of coding dummy variables.

This model corresponds to

$\begin{array}{l} y_{i m} = β_{0} + β_{1} I W_{i} + β_{2} W e e k_{i} + β_{3} I {[P B]}_{i} + β_{4} I {[P C]}_{i} + β_{5} I {[P D]}_{i} \\ + β_{6} (W e e k_{i} * I {[P B]}_{i}) + β_{7} (W e e k_{i} * I {[P C]}_{i}) + β_{8} (W e e k_{i} * I {[P D]}_{i}) \\ + b_{0 m} + b_{1 m} W e e k_{i m} + ε_{i m}, \end{array}$

where $i$ = 1, 2, ..., 120, and $m$ = 1, 2, ..., 20. $β_{j}$ are the fixed-effects coefficients, $j$ = 0, 1, ..., 8, and $b_{0 m}$ and $b_{1 m}$ are random effects. $I W$ stands for initial weight and $I [\cdot]$ is a dummy variable representing a type of program. For example, $I [P B]_{i}$ is the dummy variable representing program type B. The random effects and observation error have the following prior distributions:

$b_{0 m} \sim N (0, σ_{0}^{2})$

$b_{1 m} \sim N (0, σ_{1}^{2})$

$ε_{i m} \sim N (0, σ^{2}) .$

lme = fitlme(tbl,'y ~ InitialWeight + Program*Week + (Week|Subject)')

lme = 
Linear mixed-effects model fit by ML

Model information:
    Number of observations             120
    Fixed effects coefficients           9
    Random effects coefficients         40
    Covariance parameters                4

Formula:
    y ~ 1 + InitialWeight + Program*Week + (1 + Week | Subject)

Model fit statistics:
    AIC        BIC       LogLikelihood    Deviance
    -22.981    13.257    24.49            -48.981 

Fixed effects coefficients (95% CIs):
    Name                      Estimate     SE           tStat       DF     pValue       Lower         Upper    
    {'(Intercept)'   }          0.66105      0.25892      2.5531    111     0.012034       0.14798       1.1741
    {'InitialWeight' }        0.0031879    0.0013814      2.3078    111     0.022863    0.00045067    0.0059252
    {'Program_B'     }          0.36079      0.13139       2.746    111    0.0070394       0.10044      0.62113
    {'Program_C'     }        -0.033263      0.13117    -0.25358    111      0.80029      -0.29319      0.22666
    {'Program_D'     }          0.11317      0.13132     0.86175    111      0.39068      -0.14706       0.3734
    {'Week'          }           0.1732     0.067454      2.5677    111     0.011567      0.039536      0.30686
    {'Program_B:Week'}         0.038771     0.095394     0.40644    111      0.68521      -0.15026       0.2278
    {'Program_C:Week'}         0.030543     0.095394     0.32018    111      0.74944      -0.15849      0.21957
    {'Program_D:Week'}         0.033114     0.095394     0.34713    111      0.72915      -0.15592      0.22214

Random effects covariance parameters (95% CIs):
Group: Subject (20 Levels)
    Name1                  Name2                  Type            Estimate    Lower      Upper  
    {'(Intercept)'}        {'(Intercept)'}        {'std' }        0.18407     0.12281    0.27587
    {'Week'       }        {'(Intercept)'}        {'corr'}        0.66841     0.21076    0.88573
    {'Week'       }        {'Week'       }        {'std' }        0.15033     0.11004    0.20537

Group: Error
    Name               Estimate    Lower       Upper  
    {'Res Std'}        0.10261     0.087882    0.11981

The $p$ -values 0.022863 and 0.011567 indicate significant effects of subject initial weights and time in the amount of weight lost. The weight loss of subjects who are in program B is significantly different relative to the weight loss of subjects who are in program A. The lower and upper limits of the covariance parameters for the random effects do not include 0, thus they are significant. You can also test the significance of the random effects using the compare method.

Input Arguments

collapse all

`tbl` — Input data
table | `dataset` array

Input data, which includes the response variable, predictor variables, and grouping variables, specified as a table or dataset array. The predictor variables can be continuous or grouping variables (see Grouping Variables). The response variable must be numeric. You must specify the model for the variables using formula.

Data Types: table

`formula` — Formula for model specification
character vector or string scalar of the form `'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)'`

Formula for model specification, specified as a character vector or string scalar of the form 'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)'. The formula is case sensitive. For a full description, see Formula.

Example: 'y ~ treatment + (1|block)'

Name-Value Arguments

collapse all

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'CovariancePattern','Diagonal','Optimizer','fminunc','OptimizerOptions',opt specifies a model, where the random-effects terms have a diagonal covariance matrix structure, and fitlme uses the fminunc optimization algorithm with the custom optimization parameters defined in variable opt.

`CovariancePattern` — Pattern of covariance matrix
`'FullCholesky'` (default) | character vector | string scalar | square symmetric logical matrix | string array | cell array of character vectors or logical matrices

Pattern of the covariance matrix of the random effects, specified as the comma-separated pair consisting of 'CovariancePattern' and a character vector, a string scalar, a square symmetric logical matrix, a string array, or a cell array of character vectors or logical matrices.

If there are R random-effects terms, then the value of 'CovariancePattern' must be a string array or cell array of length R, where each element r of the array specifies the pattern of the covariance matrix of the random-effects vector associated with the rth random-effects term. The options for each element follow.

`'FullCholesky'`	Default. Full covariance matrix using the Cholesky parameterization. `fitlme` estimates all elements of the covariance matrix.
`'Full'`	Full covariance matrix, using the log-Cholesky parameterization. `fitlme` estimates all elements of the covariance matrix.
`'Diagonal'`	Diagonal covariance matrix. That is, off-diagonal elements of the covariance matrix are constrained to be 0. $(\begin{matrix} σ_{b 1}^{2} & 0 & 0 \\ 0 & σ_{b 2}^{2} & 0 \\ 0 & 0 & σ_{b 3}^{2} \end{matrix})$
`'Isotropic'`	Diagonal covariance matrix with equal variances. That is, off-diagonal elements of the covariance matrix are constrained to be 0, and the diagonal elements are constrained to be equal. For example, if there are three random-effects terms with an isotropic covariance structure, this covariance matrix looks like $(\begin{matrix} σ_{b}^{2} & 0 & 0 \\ 0 & σ_{b}^{2} & 0 \\ 0 & 0 & σ_{b}^{2} \end{matrix})$ where σ²_b is the common variance of the random-effects terms.
`'CompSymm'`	Compound symmetry structure. That is, common variance along diagonals and equal correlation between all random effects. For example, if there are three random-effects terms with a covariance matrix having a compound symmetry structure, this covariance matrix looks like $(\begin{matrix} σ_{b 1}^{2} & σ_{b 1, b 2} & σ_{b 1, b 2} \\ σ_{b 1, b 2} & σ_{b 1}^{2} & σ_{b 1, b 2} \\ σ_{b 1, b 2} & σ_{b 1, b 2} & σ_{b 1}^{2} \end{matrix})$ where σ²_b1 is the common variance of the random-effects terms and σ_b1,b2 is the common covariance between any two random-effects term.
`PAT`	Square symmetric logical matrix. If `'CovariancePattern'` is defined by the matrix `PAT`, and if `PAT(a,b) = false`, then the `(a,b)` element of the corresponding covariance matrix is constrained to be 0.

Example: 'CovariancePattern','Diagonal'

Example: 'CovariancePattern',{'Full','Diagonal'}

Data Types: char | string | logical | cell

`FitMethod` — Method for estimating parameters
`'ML'` (default) | `'REML'`

Method for estimating parameters of the linear mixed-effects model, specified as the comma-separated pair consisting of 'FitMethod' and either of the following.

`'ML'`	Default. Maximum likelihood estimation
`'REML'`	Restricted maximum likelihood estimation

Example: 'FitMethod','REML'

`Weights` — Observation weights
vector of scalar values

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a vector of length n, where n is the number of observations.

Data Types: single | double

`Exclude` — Indices for rows to exclude
use all rows without `NaNs` (default) | vector of integer or logical values

Indices for rows to exclude from the linear mixed-effects model in the data, specified as the comma-separated pair consisting of 'Exclude' and a vector of integer or logical values.

For example, you can exclude the 13th and 67th rows from the fit as follows.

Example: 'Exclude',[13,67]

Data Types: single | double | logical

`DummyVarCoding` — Coding to use for dummy variables
`'reference'` (default) | `'effects'` | `'full'`

Coding to use for dummy variables created from the categorical variables, specified as the comma-separated pair consisting of 'DummyVarCoding' and one of the variables in this table.

Value	Description
`'reference'` (default)	`fitlme` creates dummy variables with a reference group. This scheme treats the first category as a reference group and creates one less dummy variables than the number of categories. You can check the category order of a categorical variable by using the `categories` function, and change the order by using the `reordercats` function.
`'effects'`	`fitlme` creates dummy variables using effects coding. This scheme uses –1 to represent the last category. This scheme creates one less dummy variables than the number of categories.
`'full'`	`fitlme` creates full dummy variables. This scheme creates one dummy variable for each category.

For more details about creating dummy variables, see Automatic Creation of Dummy Variables.

Example: 'DummyVarCoding','effects'

`Optimizer` — Optimization algorithm
`'quasinewton'` (default) | `'fminunc'`

Optimization algorithm, specified as the comma-separated pair consisting of 'Optimizer' and either of the following.

`'quasinewton'`	Default. Uses a trust region based quasi-Newton optimizer. Change the options of the algorithm using `statset('LinearMixedModel')`. If you don’t specify the options, then `LinearMixedModel` uses the default options of `statset('LinearMixedModel')`.
`'fminunc'`	You must have Optimization Toolbox™ to specify this option. Change the options of the algorithm using `optimoptions('fminunc')`. If you don’t specify the options, then `LinearMixedModel` uses the default options of `optimoptions('fminunc')` with `'Algorithm'` set to `'quasi-newton'`.

Example: 'Optimizer','fminunc'

`OptimizerOptions` — Options for optimization algorithm
structure returned by `statset` | object returned by `optimoptions`

Options for the optimization algorithm, specified as the comma-separated pair consisting of 'OptimizerOptions' and a structure returned by statset('LinearMixedModel') or an object returned by optimoptions('fminunc').

If 'Optimizer' is 'fminunc', then use optimoptions('fminunc') to change the options of the optimization algorithm. See optimoptions for the options 'fminunc' uses. If 'Optimizer' is 'fminunc' and you do not supply 'OptimizerOptions', then the default for LinearMixedModel is the default options created by optimoptions('fminunc') with 'Algorithm' set to 'quasi-newton'.
If 'Optimizer' is 'quasinewton', then use statset('LinearMixedModel') to change the optimization parameters. If you don’t change the optimization parameters, then LinearMixedModel uses the default options created by statset('LinearMixedModel'):

The 'quasinewton' optimizer uses the following fields in the structure created by statset('LinearMixedModel').

`TolFun` — Relative tolerance on gradient of objective function
`1e-6` (default) | positive scalar value

Relative tolerance on the gradient of the objective function, specified as a positive scalar value.

`TolX` — Absolute tolerance on step size
`1e-12` (default) | positive scalar value

Absolute tolerance on the step size, specified as a positive scalar value.

`MaxIter` — Maximum number of iterations allowed
`10000` (default) | positive scalar value

Maximum number of iterations allowed, specified as a positive scalar value.

`Display` — Level of display
`'off'` (default) | `'iter'` | `'final'`

Level of display, specified as one of 'off', 'iter', or 'final'.

`StartMethod` — Method to start iterative optimization
`'default'` (default) | `'random'`

Method to start iterative optimization, specified as the comma-separated pair consisting of 'StartMethod' and either of the following.

Value	Description
`'default'`	An internally defined default value
`'random'`	A random initial value

Example: 'StartMethod','random'

`Verbose` — Indicator to display optimization process on screen
`false` (default) | `true`

Indicator to display the optimization process on screen, specified as the comma-separated pair consisting of 'Verbose' and either false or true. Default is false.

The setting for 'Verbose' overrides the field 'Display' in 'OptimizerOptions'.

Example: 'Verbose',true

`CheckHessian` — Indicator to check positive definiteness of Hessian
`false` (default) | `true`

Indicator to check the positive definiteness of the Hessian of the objective function with respect to unconstrained parameters at convergence, specified as the comma-separated pair consisting of 'CheckHessian' and either false or true. Default is false.

Specify 'CheckHessian' as true to verify optimality of the solution or to determine if the model is overparameterized in the number of covariance parameters.

Example: 'CheckHessian',true

Output Arguments

collapse all

`lme` — Linear mixed-effects model
`LinearMixedModel` object

Linear mixed-effects model, returned as a LinearMixedModel object.

More About

collapse all

Formula

In general, a formula for model specification is a character vector or string scalar of the form 'y ~ terms'. For the linear mixed-effects models, this formula is in the form 'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)', where fixed and random contain the fixed-effects and the random-effects terms.

Suppose a table tbl contains the following:

A response variable, y
Predictor variables, X_j, which can be continuous or grouping variables
Grouping variables, g₁, g₂, ..., g_R,

where the grouping variables in X_j and g_r can be categorical, logical, character arrays, string arrays, or cell arrays of character vectors.

Then, in a formula of the form, 'y ~ fixed + (random₁|g₁) + ... + (random_R|g_R)', the term fixed corresponds to a specification of the fixed-effects design matrix X, random₁ is a specification of the random-effects design matrix Z₁ corresponding to grouping variable g₁, and similarly random_R is a specification of the random-effects design matrix Z_R corresponding to grouping variable g_R. You can express the fixed and random terms using Wilkinson notation.

Wilkinson notation describes the factors present in models. The notation relates to factors present in models, not to the multipliers (coefficients) of those factors.

Wilkinson Notation	Factors in Standard Notation
`1`	Constant (intercept) term
`X^k`, where `k` is a positive integer	`X`, `X²`, ..., `X^k`
`X1 + X2`	`X1`, `X2`
`X1*X2`	`X1`, `X2`, `X1.*X2 (elementwise multiplication of X1 and X2)`
`X1:X2`	`X1.*X2` only
`- X2`	Do not include `X2`
`X1*X2 + X3`	`X1`, `X2`, `X3`, `X1*X2`
`X1 + X2 + X3 + X1:X2`	`X1`, `X2`, `X3`, `X1*X2`
`X1X2X3 - X1:X2:X3`	`X1`, `X2`, `X3`, `X1X2`, `X1X3`, `X2*X3`
`X1*(X2 + X3)`	`X1`, `X2`, `X3`, `X1X2`, `X1X3`

Statistics and Machine Learning Toolbox™ notation always includes a constant term unless you explicitly remove the term using -1. Here are some examples for linear mixed-effects model specification.

Examples:

Formula	Description
`'y ~ X1 + X2'`	Fixed effects for the intercept, `X1` and `X2`. This is equivalent to `'y ~ 1 + X1 + X2'`.
`'y ~ -1 + X1 + X2'`	No intercept and fixed effects for `X1` and `X2`. The implicit intercept term is suppressed by including `-1`.
`'y ~ 1 + (1 \| g1)'`	Fixed effects for the intercept plus random effect for the intercept for each level of the grouping variable `g1`.
`'y ~ X1 + (1 \| g1)'`	Random intercept model with a fixed slope.
`'y ~ X1 + (X1 \| g1)'`	Random intercept and slope, with possible correlation between them. This is equivalent to `'y ~ 1 + X1 + (1 + X1\|g1)'`.
`'y ~ X1 + (1 \| g1) + (-1 + X1 \| g1)'`	Independent random effects terms for intercept and slope.
`'y ~ 1 + (1 \| g1) + (1 \| g2) + (1 \| g1:g2)'`	Random intercept model with independent main effects for `g1` and `g2`, plus an independent interaction effect.

Cholesky Parameterization

One of the assumptions of linear mixed-effects models is that the random effects have the following prior distribution.

$b ~ N (0, σ^{2} D (θ)),$

where D is a q-by-q symmetric and positive semidefinite matrix, parameterized by a variance component vector θ, q is the number of variables in the random-effects term, and σ² is the observation error variance. Since the covariance matrix of the random effects, D, is symmetric, it has q(q+1)/2 free parameters. Suppose L is the lower triangular Cholesky factor of D(θ) such that

$D (θ) = L (θ) L {(θ)}^{T},$

then the q*(q+1)/2-by-1 unconstrained parameter vector θ is formed from elements in the lower triangular part of L.

For example, if

$L = [\begin{matrix} L_{11} & 0 & 0 \\ L_{21} & L_{22} & 0 \\ L_{31} & L_{32} & L_{33} \end{matrix}],$

then

$θ = [\begin{matrix} L_{11} \\ L_{21} \\ L_{31} \\ L_{22} \\ L_{32} \\ L_{33} \end{matrix}] .$

Log-Cholesky Parameterization

When the diagonal elements of L in Cholesky parameterization are constrained to be positive, then the solution for L is unique. Log-Cholesky parameterization is the same as Cholesky parameterization except that the logarithm of the diagonal elements of L are used to guarantee unique parameterization.

For example, for the 3-by-3 example in Cholesky parameterization, enforcing L_ii ≥ 0,

$θ = [\begin{matrix} \log (L_{11}) \\ L_{21} \\ L_{31} \\ \log (L_{22}) \\ L_{32} \\ \log (L_{33}) \end{matrix}] .$

Alternatives

If your model is not easily described using a formula, you can create matrices to define the fixed and random effects, and fit the model using fitlmematrix(X,y,Z,G).

References

[1] Pinherio, J. C., and D. M. Bates. “Unconstrained Parametrizations for Variance-Covariance Matrices”. Statistics and Computing, Vol. 6, 1996, pp. 289–296.

Version History

Introduced in R2013b

fitlme

Syntax

Description

Examples

Fit Linear Mixed-Effects Model

Fit Random Intercept LME Model

LME Model for Randomized Block Design

LME Model for Split-Plot Experiment

Longitudinal Study with a Covariate

Input Arguments

`tbl` — Input data
table | `dataset` array

`formula` — Formula for model specification
character vector or string scalar of the form `'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)'`

Name-Value Arguments

`CovariancePattern` — Pattern of covariance matrix
`'FullCholesky'` (default) | character vector | string scalar | square symmetric logical matrix | string array | cell array of character vectors or logical matrices

`FitMethod` — Method for estimating parameters
`'ML'` (default) | `'REML'`

`Weights` — Observation weights
vector of scalar values

`Exclude` — Indices for rows to exclude
use all rows without `NaNs` (default) | vector of integer or logical values

`DummyVarCoding` — Coding to use for dummy variables
`'reference'` (default) | `'effects'` | `'full'`

`Optimizer` — Optimization algorithm
`'quasinewton'` (default) | `'fminunc'`

`OptimizerOptions` — Options for optimization algorithm
structure returned by `statset` | object returned by `optimoptions`

`TolFun` — Relative tolerance on gradient of objective function
`1e-6` (default) | positive scalar value

`TolX` — Absolute tolerance on step size
`1e-12` (default) | positive scalar value

`MaxIter` — Maximum number of iterations allowed
`10000` (default) | positive scalar value

`Display` — Level of display
`'off'` (default) | `'iter'` | `'final'`

`StartMethod` — Method to start iterative optimization
`'default'` (default) | `'random'`

`Verbose` — Indicator to display optimization process on screen
`false` (default) | `true`

`CheckHessian` — Indicator to check positive definiteness of Hessian
`false` (default) | `true`

Output Arguments

`lme` — Linear mixed-effects model
`LinearMixedModel` object

More About

Formula

Cholesky Parameterization

Log-Cholesky Parameterization

Alternatives

References

Version History

See Also

Topics

fitlme

Syntax

Description

Examples

Fit Linear Mixed-Effects Model

Fit Random Intercept LME Model

LME Model for Randomized Block Design

LME Model for Split-Plot Experiment

Longitudinal Study with a Covariate

Input Arguments

tbl — Input data table | dataset array

formula — Formula for model specification character vector or string scalar of the form 'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)'

Name-Value Arguments

CovariancePattern — Pattern of covariance matrix 'FullCholesky' (default) | character vector | string scalar | square symmetric logical matrix | string array | cell array of character vectors or logical matrices

FitMethod — Method for estimating parameters 'ML' (default) | 'REML'

Weights — Observation weights vector of scalar values

Exclude — Indices for rows to exclude use all rows without NaNs (default) | vector of integer or logical values

DummyVarCoding — Coding to use for dummy variables 'reference' (default) | 'effects' | 'full'

Optimizer — Optimization algorithm 'quasinewton' (default) | 'fminunc'

OptimizerOptions — Options for optimization algorithm structure returned by statset | object returned by optimoptions

TolFun — Relative tolerance on gradient of objective function 1e-6 (default) | positive scalar value

TolX — Absolute tolerance on step size 1e-12 (default) | positive scalar value

MaxIter — Maximum number of iterations allowed 10000 (default) | positive scalar value

Display — Level of display 'off' (default) | 'iter' | 'final'

StartMethod — Method to start iterative optimization 'default' (default) | 'random'

Verbose — Indicator to display optimization process on screen false (default) | true

CheckHessian — Indicator to check positive definiteness of Hessian false (default) | true

Output Arguments

lme — Linear mixed-effects model LinearMixedModel object

More About

Formula

Cholesky Parameterization

Log-Cholesky Parameterization

Alternatives

References

Version History

See Also

Topics

`tbl` — Input data
table | `dataset` array

`formula` — Formula for model specification
character vector or string scalar of the form `'y ~ fixed + (random1|grouping1) + ... + (randomR|groupingR)'`

`CovariancePattern` — Pattern of covariance matrix
`'FullCholesky'` (default) | character vector | string scalar | square symmetric logical matrix | string array | cell array of character vectors or logical matrices

`FitMethod` — Method for estimating parameters
`'ML'` (default) | `'REML'`

`Weights` — Observation weights
vector of scalar values

`Exclude` — Indices for rows to exclude
use all rows without `NaNs` (default) | vector of integer or logical values

`DummyVarCoding` — Coding to use for dummy variables
`'reference'` (default) | `'effects'` | `'full'`

`Optimizer` — Optimization algorithm
`'quasinewton'` (default) | `'fminunc'`

`OptimizerOptions` — Options for optimization algorithm
structure returned by `statset` | object returned by `optimoptions`

`TolFun` — Relative tolerance on gradient of objective function
`1e-6` (default) | positive scalar value

`TolX` — Absolute tolerance on step size
`1e-12` (default) | positive scalar value

`MaxIter` — Maximum number of iterations allowed
`10000` (default) | positive scalar value

`Display` — Level of display
`'off'` (default) | `'iter'` | `'final'`

`StartMethod` — Method to start iterative optimization
`'default'` (default) | `'random'`

`Verbose` — Indicator to display optimization process on screen
`false` (default) | `true`

`CheckHessian` — Indicator to check positive definiteness of Hessian
`false` (default) | `true`

`lme` — Linear mixed-effects model
`LinearMixedModel` object