
Screen Risk Factors

Remove risk factors from data in Modelscape

Since R2021b

Description

Use the Modelscape™ Screen Risk Factors task to automatically remove risk factors from a data table based on their predictive power relative to a binary response variable. Feature selection is an important step in the development of a statistical model. Input data can have hundreds or thousands of variables, and discarding some variables often improves model interpretability, training times, and other important attributes. The task automatically generates MATLAB® code for your live script. This task requires the Modelscape for MATLAB support package.

Using this task, you can:

  • Inspect summary statistics and histograms for variables in a data table.

  • Use customizable screening criteria to analyze the predictive power of variables.

  • Remove variables from a data table and record the corresponding reason for exclusion.

  • Record reasons for including variables in a data table.

  • Export the resulting subtables to MATLAB desktop.

For general information about Live Editor tasks, see Add Interactive Tasks to a Live Script.


Open the Screen Risk Factors Task

To add the Screen Risk Factors task to a live script in the MATLAB Editor:

  • On the Live Editor tab, select Task > Screen Risk Factors.


  • In a code block in the script, type a relevant keyword, such as screen. Select Screen Risk Factors from the suggested command completions.


Parameters


Input table must be a MATLAB table or a timetable. Each column of Input table is a variable, for example, Residence Status or Customer ID, and each row is a data point.

Response variable must be a binary variable in the input table. The task evaluates the risk factors in the input data table based on their power to predict this response variable.
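As a rough illustration of inputs that meet these requirements, the following sketch builds a small table and checks that the intended response column has exactly two distinct values. All variable names and values here (CustomerID, ResidenceStatus, Income, Defaulted) are made up for the example, and the check is an assumption about what "binary" means in practice, not code that the task generates.

```matlab
% Hypothetical example data; names and values are illustrative only.
CustomerID = (1:6)';
ResidenceStatus = categorical(["Owner"; "Tenant"; "Owner"; "Other"; "Tenant"; "Owner"]);
Income = [52000; 31000; 78000; 45000; 29000; 66000];
Defaulted = [0; 1; 0; 0; 1; 0];   % candidate binary response

% Each column is a variable; each row is a data point.
inputTable = table(CustomerID, ResidenceStatus, Income, Defaulted);

% Assumed check: a binary response takes exactly two distinct values.
responseValues = unique(inputTable.Defaulted);
isBinaryResponse = numel(responseValues) == 2;
```

If isBinaryResponse is false, the column is unlikely to be a suitable choice for Response variable.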

Criteria must be an object containing the criteria against which to screen the input variables. You can use the predefined criteria or customize your own screening criteria. For more details, see Screen Risk Factors by Custom Criteria.

Check the Filtered table check box to display the subtable after excluding the removed variables. The filtered table contains the columns from the Input table without the variables that you mark for exclusion.
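The relationship between the input table and the filtered table can be sketched with the base MATLAB function removevars. The table contents and the list of excluded variables below are hypothetical, and the code that the task generates may look different.

```matlab
% Hypothetical input table with a binary response named Defaulted.
inputTable = table((1:4)', [0.2; 0.4; 0.1; 0.9], [1; 1; 1; 1], [0; 1; 0; 1], ...
    'VariableNames', {'CustomerID', 'Utilization', 'ConstantFlag', 'Defaulted'});

% Variables marked for exclusion (illustrative choices).
excludedVars = {'CustomerID', 'ConstantFlag'};

% The filtered table keeps every column except the excluded ones.
filteredTable = removevars(inputTable, excludedVars);
remainingVars = filteredTable.Properties.VariableNames;
```

The rows are unchanged; only the excluded columns are dropped.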

Check the Preview summary tables check box to display two tables of additional information about the feature selection process. The exclusionSummaryPreview table includes all the data of the input table together with the exclusion flags and comments that you record in the task. The progressSummaryPreview table shows the total number of variables that are present, excluded, included, and commented against.
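To make the kind of bookkeeping behind the progress summary concrete, the sketch below tallies decisions from a small table of flags and comments. The layout of the decisions table, its column names, and the interpretation of each count are assumptions made for illustration; the actual exclusionSummaryPreview and progressSummaryPreview tables may be organized differently.

```matlab
% Hypothetical record of screening decisions, one row per variable.
Variable = ["CustomerID"; "Utilization"; "ConstantFlag"; "Income"];
Excluded = [true; false; true; false];
Included = [false; true; false; false];
Comment  = ["Identifier, not a risk factor"; "Strong predictor"; "No variation"; ""];
decisions = table(Variable, Excluded, Included, Comment);

% Assumed tallies: variables present, excluded, included, and commented on.
progress = table(height(decisions), nnz(decisions.Excluded), ...
    nnz(decisions.Included), nnz(strlength(decisions.Comment) > 0), ...
    'VariableNames', {'Present', 'Excluded', 'Included', 'Commented'});
```

In this sketch, a variable such as Income that has no flag and no comment counts only toward Present.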

Version History

Introduced in R2021b