Remove Risk Factors

This example shows how to remove or include variables from a table and record the corresponding reasons using the Modelscape™ Remove Risk Factors task.

The example also shows how to include the results of this analysis in model documents using the Modelscape reporting feature.

Not every column in a table of input data is relevant or usable when you develop a statistical model. For example, randomized user identifiers (IDs) are often irrelevant, legally sensitive data such as ethnic origin or religious beliefs cannot be used, and some data can be of poor quality. This example shows you how to select relevant variables in such a table and record your reasons.

This example uses the Credit Scorecard data set, which contains three tables of customer information such as age, income, and employment status. One of these tables, dataMissing, deliberately contains a few blank entries. You can use this data to develop a statistical model such as a MATLAB® credit scorecard model. The example loads the data set in the Remove Risk Factors task, marks some variables for exclusion, and documents the results using Modelscape reporting.

Load Data and Launch the Tool

Load the input data from CreditCardData.mat.

load CreditCardData
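
Before opening the task, it can help to check what was loaded. The following is a minimal sketch that uses only standard MATLAB table functions. It assumes that dataMissing is a table, as the rest of this example implies. The names missingCounts and missingSummary are illustrative and are not part of Modelscape.

```matlab
% Load the example data (provides the table dataMissing, among others).
load CreditCardData

% Preview the first rows to see which variables are available.
head(dataMissing)

% Count missing entries per variable. Nonzero counts point to
% variables that may need a data-quality comment in the task.
missingCounts = sum(ismissing(dataMissing), 1);
missingSummary = table(dataMissing.Properties.VariableNames', missingCounts', ...
    'VariableNames', {'Variable', 'NumMissing'})
```

Variables with many missing entries are natural candidates to inspect first in the task.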

Open a new live script. There are two ways to open the Remove Risk Factors task:

  1. Type remove and select Remove Risk Factors in the drop-down selection.

  2. Search for the tool under Task in the Live Editor gallery.

In the task, select your input data, for example the dataMissing variable.


Inspect and Filter Variables

The task shows the summary statistics and the histogram for the first variable in the table (in this case CustID).

To inspect other variables, click the corresponding variable name in the Analyze data variables section. This section contains three columns that you can sort. The Variable Names column is read-only. The Exclude column lets you exclude variables from the table: select the Exclude check box to mark the corresponding variable for removal. The Comment column lets you record the reason for the exclusion (or inclusion): double-click the box to edit it.


When you exclude variables and add comments, the task dynamically produces two outputs:

  • filteredTable: This is a subtable of the input table without the excluded risk factors. Use this subtable in the next step of the model development process, for example feature selection.

  • exclusionTable: This table contains all the data of the input table, together with the exclusion flags and comments entered in the task. The flags and comments are stored as metadata in exclusionTable.Properties.CustomProperties. To view this information, select the Preview summary tables check box in the Display results section.
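
The relationship between the input table and the two outputs can be illustrated from the command line. The following is a rough sketch, not the code that the task generates. It assumes that CustID is the only excluded variable, which is a hypothetical choice, and it builds a comparable subtable with the standard removevars function. The name manualFiltered is illustrative. The checks against filteredTable and exclusionTable run only if the task has already created those variables.

```matlab
load CreditCardData

% Hypothetical exclusion: drop the identifier column by hand.
manualFiltered = removevars(dataMissing, "CustID");

% If the task has produced filteredTable, list the variables it excluded.
if exist("filteredTable", "var") == 1
    excludedNames = setdiff(dataMissing.Properties.VariableNames, ...
        filteredTable.Properties.VariableNames)
end

% If the task has produced exclusionTable, display its stored metadata.
% The names of the individual custom properties are not assumed here.
if exist("exclusionTable", "var") == 1
    disp(exclusionTable.Properties.CustomProperties)
end
```

Unlike this manual approach, the task keeps the reasons for each decision alongside the data, which is what makes the later reporting step possible.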


progressSummaryPreview lists the total number of variables, the excluded variables, the included variables, and the number of variables with comments. You can use the last of these to judge whether the removal process is complete: by the end, every variable should have either a reason for exclusion or some indication that it has been inspected.

Document with Modelscape Reporting

Use Modelscape Reporting to document the findings of this analysis. The metadata stored in exclusionTable serves this purpose. To include the tables that the task previews as exclusionSummaryPreview and progressSummaryPreview in a Word document, create document holes with the titles ExclusionSummary and ProgressSummary in the Word document, and create workspace variables with the same names using summarizeExclusionTable.

[ExclusionSummary, ProgressSummary] = summarizeExclusionTable(exclusionTable)

To create document holes in a Word document, open the Developer tab and click the 'Rich Text Content Control' symbol Aa in the Controls area. Then click 'Properties' and fill in the Title field of each hole.

When you run fillReportFromWorkspace, it picks up these new variables from the MATLAB workspace and inserts them into the model document.
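
Because the report is filled from workspace variables, a quick check that the expected variables exist can catch naming mistakes early. The following is a minimal sketch that uses only the standard MATLAB functions exist and assert. It assumes that the variable names must match the hole titles ExclusionSummary and ProgressSummary, as the steps above suggest. The name holeTitles is illustrative. The call to fillReportFromWorkspace itself is not shown because its arguments are not described here.

```matlab
% Titles of the document holes created in the Word document.
holeTitles = ["ExclusionSummary", "ProgressSummary"];

% Confirm that a workspace variable exists for each hole title.
for name = holeTitles
    assert(exist(name, "var") == 1, ...
        "Workspace variable %s is missing.", name);
end
```

Run this check after the call to summarizeExclusionTable and before filling the report.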

For more information on fillReportFromWorkspace, see Model Documentation in Modelscape.