random

Generate random responses from fitted multinomial regression model

Since R2023a

    Description

    Ysim = random(mdl,XNew) returns a vector of responses, one for each row of XNew. Each response is randomly sampled from the multinomial distribution that the fitted multinomial regression model object mdl produces for the predictor data in that row.

    Ysim = random(mdl,XNew,NumTrials) specifies the number of randomly sampled responses to return for each data point in XNew.

    Ysim = random(___,type) specifies whether to return the simulated responses as category labels or a table of counts, using any of the input argument combinations in previous syntaxes.

    Examples

    Load the fisheriris sample data set.

    load fisheriris

    The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

    Fit a multinomial regression model using the measurements as the predictor data and the iris species as the response data.

    mdl = fitmnr(meas,species);

    mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

    Use the rand function to generate new predictor data by adding uniformly distributed random noise to the measurements data.

    rng("default")  % Set the random seed for reproducibility
    sz = size(meas);
    measNew = meas + rand(sz);

    Each row of measNew corresponds to predictor data for a unique data point.

    Generate random response values sampled from the multinomial distributions generated by passing measNew to the fitted model.

    speciesPred = random(mdl,measNew)
    speciesPred = 150x1 cell
        {'versicolor'}
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'versicolor'}
        {'setosa'    }
        {'versicolor'}
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'versicolor'}
        {'setosa'    }
        {'versicolor'}
        {'setosa'    }
        {'versicolor'}
        {'setosa'    }
        {'versicolor'}
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'setosa'    }
        {'versicolor'}
        {'setosa'    }
        {'setosa'    }
        {'versicolor'}
          ⋮
    
    

    Display the original response data.

    species
    species = 150x1 cell
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
        {'setosa'}
          ⋮
    
    

    The outputs for speciesPred and species show that the majority of the randomly generated responses in speciesPred match the original response labels in species. The difference between measNew and meas contributes to the difference between speciesPred and species, as does the random nature of sampling the speciesPred values from the multinomial distributions associated with the predictor data.
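
    One way to quantify the agreement described above is to compute the fraction of simulated labels that equal the original labels. The following is a minimal sketch that repeats the example's setup so that it runs on its own. The variable name matchRate is introduced here for illustration and is not part of the original example.

```matlab
load fisheriris
mdl = fitmnr(meas,species);

rng("default")  % Same seed as in the example above
measNew = meas + rand(size(meas));
speciesPred = random(mdl,measNew);

% Fraction of simulated labels that equal the original labels.
% strcmp compares the two cell arrays of labels element by element.
matchRate = mean(strcmp(speciesPred,species))
```

    A value near 1 indicates that most simulated labels agree with the original labels. The exact value depends on the random seed and on the noise added to meas.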

    Load the fisheriris sample data set.

    load fisheriris

    The column vector species contains three iris flower species: setosa, versicolor, and virginica. The matrix meas contains four types of measurements for the flowers: the length and width of sepals and petals in centimeters.

    Fit a multinomial regression model using the measurements as the predictor data and the iris species as the response data.

    mdl = fitmnr(meas,species);

    mdl is a multinomial regression model object that contains the results of fitting a nominal multinomial regression model to the data.

    Use the rand function to generate new predictor data by adding uniformly distributed random noise to the measurements data.

    rng("default")  % Set the random seed for reproducibility
    sz = size(meas);
    measNew = meas + 5*rand(sz);

    Each row of measNew corresponds to predictor data for a unique data point.

    Generate random response values sampled from the multinomial distributions generated by passing measNew to the fitted model. For each data point in measNew, generate 100 samples from the corresponding multinomial distribution, and return the results in a table of counts.

    speciesPred = random(mdl,measNew,100,"counts")
    speciesPred=150×3 table
        setosa    versicolor    virginica
        ______    __________    _________
    
           0          31            69   
         100           0             0   
           0         100             0   
           0         100             0   
           0         100             0   
           0          78            22   
           0         100             0   
           0           0           100   
           0          98             2   
         100           0             0   
           0         100             0   
           0           0           100   
           0           0           100   
           0         100             0   
           0         100             0   
           0           0           100   
          ⋮
    
    

    Each row of the speciesPred table corresponds to a data point in measNew, and each table variable corresponds to a response category. The elements of speciesPred indicate how many times each species is sampled.
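
    Dividing each row of counts by the number of trials gives the empirical proportion of draws for each species, which can be easier to compare across data points. The following is a minimal sketch that repeats the example's setup. The names numTrials, props, and rowSums are illustrative and do not appear in the original example.

```matlab
load fisheriris
mdl = fitmnr(meas,species);

rng("default")  % Same seed as in the example above
measNew = meas + 5*rand(size(meas));
numTrials = 100;
speciesPred = random(mdl,measNew,numTrials,"counts");

% Extract the counts as a numeric matrix and convert them to proportions
props = speciesPred{:,:} / numTrials;

% Each row of counts sums to numTrials, so each row of props sums to 1
rowSums = sum(props,2);
```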

    Input Arguments

    mdl: Multinomial regression model object, specified as a MultinomialRegression model object created with the fitmnr function.

    XNew: New predictor input values, specified as a table or an n-by-p matrix, where n is the number of new observations and p is the number of predictor variables used to fit mdl.

    • If XNew is a table, it must contain all the names of the predictors used to fit mdl. You can find the predictor names in the mdl.PredictorNames property.

    • If XNew is a matrix, it must have the same number of columns as the number of predictor variables used to fit mdl. You can find the number of predictor variables in the mdl.NumPredictors property. You can specify XNew as a matrix only when all names in mdl.PredictorNames refer to numeric predictors.

    Example: random(mdl,[5.8 2.7; 5.7 2.5]) generates a simulated response from each multinomial distribution given by evaluating the two-predictor model mdl at the query points xq1 = [5.8 2.7] and xq2 = [5.7 2.5].

    Data Types: single | double | table
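
    The table form of XNew can be sketched as follows. This is an illustration under stated assumptions, not a definitive recipe: it assumes that fitmnr accepts a table together with the name of the response variable, and the variable names SepalLength, SepalWidth, PetalLength, PetalWidth, and Species are chosen here for illustration only.

```matlab
load fisheriris

% Build a table of predictors and response (variable names are illustrative)
tbl = array2table(meas, ...
    VariableNames=["SepalLength","SepalWidth","PetalLength","PetalWidth"]);
tbl.Species = species;

% Assumption: fitmnr accepts a table and a response variable name
mdlTbl = fitmnr(tbl,"Species");

% XNew as a table must contain every predictor name used to fit the model
XNewTbl = tbl(1:5,mdlTbl.PredictorNames);

rng("default")  % Set the random seed for reproducibility
YsimTbl = random(mdlTbl,XNewTbl);
```

    Selecting the columns by mdlTbl.PredictorNames guarantees that the new table contains every required predictor, regardless of any other variables stored in tbl.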

    NumTrials: Number of trials to simulate for each data point in XNew, specified as a positive integer scalar.

    Data Types: single | double

    type: Type for the returned simulated responses, specified as "samples" or "counts". To return the simulated response values as a cell array of class labels, specify type as "samples". To return the simulated response values as a table of counts for each response category, specify type as "counts".

    Data Types: char | string

    Output Arguments

    Ysim: Simulated responses, returned as a cell array of category labels or a table of counts.

    • If you specify the input argument type as "samples", Ysim is an r-by-m cell array of category labels. r is the number of data points in XNew, and m is the number of trials specified in NumTrials.

    • If you specify the input argument type as "counts", Ysim is an r-by-N table, where N is the number of response variable categories in the data used to fit mdl.
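
    The two output shapes described above can be compared side by side. The following is a minimal sketch using the fisheriris data from the examples. The names Ysamples and Ycounts are illustrative, and the sizes noted in the comments follow from the descriptions above rather than from a recorded run.

```matlab
load fisheriris
mdl = fitmnr(meas,species);

rng("default")     % Set the random seed for reproducibility
XNew = meas(1:4,:);  % r = 4 data points
numTrials = 3;       % m = 3 trials per data point

% "samples": one category label per data point and trial (r-by-m)
Ysamples = random(mdl,XNew,numTrials,"samples");

% "counts": one row per data point, one variable per category (r-by-N)
Ycounts = random(mdl,XNew,numTrials,"counts");

size(Ysamples)
size(Ycounts)
```

    With three species in the response, N is 3 here, so both outputs have the same dimensions even though one holds labels and the other holds counts.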

    Algorithms

    The random function uses the fitted multinomial regression model object mdl to calculate response category probabilities for each data point in XNew. Each set of probabilities defines a multinomial distribution from which the function randomly samples response values for the corresponding predictor data.
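
    The two steps described above can be approximated by hand. The following sketch is an illustration of the idea only and is not necessarily how random is implemented internally. It assumes that the predict function of the model object returns the category probabilities as its second output, and it uses mnrnd to draw the multinomial counts. The names probs, numTrials, and counts are illustrative.

```matlab
load fisheriris
mdl = fitmnr(meas,species);

rng("default")  % Set the random seed for reproducibility
XNew = meas(1:5,:);

% Step 1: category probabilities for each row of XNew
% (assumption: the second output of predict holds these probabilities)
[~,probs] = predict(mdl,XNew);

% Renormalize so that each row sums to 1 despite rounding error
probs = probs ./ sum(probs,2);

% Step 2: sample counts from the multinomial distribution of each row
numTrials = 100;
counts = mnrnd(numTrials,probs);
```

    Each row of counts is analogous to a row of the table that random returns when type is "counts", although the two approaches consume random numbers differently and so do not produce identical draws.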

    Version History

    Introduced in R2023a