Stack or Unstack Dataset Arrays

This example shows how to reformat dataset arrays using stack and unstack.

Load sample data.

Import the data from the comma-separated text file testScores.csv.

ds = dataset('File','testScores.csv','Delimiter',',')
ds = 

    LastName            School                Test1    Test2    Test3
    {'Jeong'   }        {'XYZ School'}        90       87       93   
    {'Collins' }        {'XYZ School'}        87       85       83   
    {'Torres'  }        {'XYZ School'}        86       85       88   
    {'Phillips'}        {'ABC School'}        75       80       72   
    {'Ling'    }        {'ABC School'}        89       86       87   
    {'Ramirez' }        {'ABC School'}        96       92       98   
    {'Lee'     }        {'XYZ School'}        78       75       77   
    {'Walker'  }        {'ABC School'}        91       94       92   
    {'Garcia'  }        {'ABC School'}        86       83       85   
    {'Chang'   }        {'XYZ School'}        79       76       82    

Each of the 10 students has 3 test scores.

Perform calculations on the dataset array.

With the data in this format, one row per student and one variable per test, you can calculate the average test score for each student by taking the mean across each row. The test scores are in columns 3 to 5.

ds.TestAve = mean(double(ds(:,3:5)),2);
ds(:,{'LastName','School','TestAve'})
ans = 

    LastName            School                TestAve
    {'Jeong'   }        {'XYZ School'}            90 
    {'Collins' }        {'XYZ School'}            85 
    {'Torres'  }        {'XYZ School'}        86.333 
    {'Phillips'}        {'ABC School'}        75.667 
    {'Ling'    }        {'ABC School'}        87.333 
    {'Ramirez' }        {'ABC School'}        95.333 
    {'Lee'     }        {'XYZ School'}        76.667 
    {'Walker'  }        {'ABC School'}        92.333 
    {'Garcia'  }        {'ABC School'}        84.667 
    {'Chang'   }        {'XYZ School'}            79   

A new variable with average test scores is added to the dataset array, ds.
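The same calculation can also select the test score variables by name rather than by column position, which keeps working if the variables are reordered. The following is a minimal sketch under stated assumptions: the two-row dataset array dsDemo and the helper variable testVars are made up for illustration and are not part of testScores.csv.

```matlab
% Hypothetical two-student dataset array, built inline so the sketch
% does not depend on testScores.csv.
dsDemo = dataset({{'Jeong';'Collins'},'LastName'}, ...
                 {[90;87],'Test1'},{[87;85],'Test2'},{[93;83],'Test3'});

% Select the score variables by name, convert them to a numeric matrix,
% and average across each row (dimension 2).
testVars = {'Test1','Test2','Test3'};
dsDemo.TestAve = mean(double(dsDemo(:,testVars)),2);
```

Indexing by name trades a little verbosity for robustness: ds(:,3:5) silently picks up the wrong columns if a variable is inserted before the test scores.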

Reformat the dataset array.

Stack the test score variables into a new variable, Scores.

dsNew = stack(ds,{'Test1','Test2','Test3'},...
              'NewDataVarName','Scores')
dsNew = 

    LastName            School                TestAve    Scores_Indicator    Scores
    {'Jeong'   }        {'XYZ School'}            90     Test1               90    
    {'Jeong'   }        {'XYZ School'}            90     Test2               87    
    {'Jeong'   }        {'XYZ School'}            90     Test3               93    
    {'Collins' }        {'XYZ School'}            85     Test1               87    
    {'Collins' }        {'XYZ School'}            85     Test2               85    
    {'Collins' }        {'XYZ School'}            85     Test3               83    
    {'Torres'  }        {'XYZ School'}        86.333     Test1               86    
    {'Torres'  }        {'XYZ School'}        86.333     Test2               85    
    {'Torres'  }        {'XYZ School'}        86.333     Test3               88    
    {'Phillips'}        {'ABC School'}        75.667     Test1               75    
    {'Phillips'}        {'ABC School'}        75.667     Test2               80    
    {'Phillips'}        {'ABC School'}        75.667     Test3               72    
    {'Ling'    }        {'ABC School'}        87.333     Test1               89    
    {'Ling'    }        {'ABC School'}        87.333     Test2               86    
    {'Ling'    }        {'ABC School'}        87.333     Test3               87    
    {'Ramirez' }        {'ABC School'}        95.333     Test1               96    
    {'Ramirez' }        {'ABC School'}        95.333     Test2               92    
    {'Ramirez' }        {'ABC School'}        95.333     Test3               98    
    {'Lee'     }        {'XYZ School'}        76.667     Test1               78    
    {'Lee'     }        {'XYZ School'}        76.667     Test2               75    
    {'Lee'     }        {'XYZ School'}        76.667     Test3               77    
    {'Walker'  }        {'ABC School'}        92.333     Test1               91    
    {'Walker'  }        {'ABC School'}        92.333     Test2               94    
    {'Walker'  }        {'ABC School'}        92.333     Test3               92    
    {'Garcia'  }        {'ABC School'}        84.667     Test1               86    
    {'Garcia'  }        {'ABC School'}        84.667     Test2               83    
    {'Garcia'  }        {'ABC School'}        84.667     Test3               85    
    {'Chang'   }        {'XYZ School'}            79     Test1               79    
    {'Chang'   }        {'XYZ School'}            79     Test2               76    
    {'Chang'   }        {'XYZ School'}            79     Test3               82       

The original test variable names, Test1, Test2, and Test3, appear as levels in the combined test scores indicator variable, Scores_Indicator.
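One way to sanity-check a stacked result is to confirm that it has one row per student per test and that the indicator variable holds the original variable names. The sketch below assumes a made-up two-student dataset array (dsDemo) rather than the data in testScores.csv, and it assumes the indicator variable is a nominal array, as its display in the output above suggests.

```matlab
% Hypothetical two-student dataset array for illustration only.
dsDemo = dataset({{'Jeong';'Collins'},'LastName'}, ...
                 {[90;87],'Test1'},{[87;85],'Test2'},{[93;83],'Test3'});

% Stack the three test variables into a single variable, Scores.
dsLong = stack(dsDemo,{'Test1','Test2','Test3'}, ...
               'NewDataVarName','Scores');

% Expect (number of students) x (number of stacked variables) rows.
nRows = size(dsLong,1);

% List the distinct values of the indicator variable as strings.
testLevels = cellstr(unique(dsLong.Scores_Indicator));
```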

Plot data grouped by category.

With the data in this format, you can use Scores_Indicator as a grouping variable, and draw box plots of test scores grouped by test.

figure()
boxplot(dsNew.Scores,dsNew.Scores_Indicator)
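The same grouping variable can drive a numerical summary alongside the box plot. As a sketch, assuming the Statistics and Machine Learning Toolbox function grpstats and a made-up two-student dataset array (dsDemo) in place of the real data, the mean score per test could be computed like this:

```matlab
% Hypothetical two-student dataset array for illustration only.
dsDemo = dataset({{'Jeong';'Collins'},'LastName'}, ...
                 {[90;87],'Test1'},{[87;85],'Test2'},{[93;83],'Test3'});

% Put the data in stacked (long) format, as in the example above.
dsLong = stack(dsDemo,{'Test1','Test2','Test3'}, ...
               'NewDataVarName','Scores');

% Compute the mean of Scores within each level of Scores_Indicator.
% The result is a dataset array with one row per test.
stats = grpstats(dsLong,'Scores_Indicator','mean','DataVars','Scores');
```

Restricting 'DataVars' to Scores keeps grpstats from trying to summarize the non-numeric LastName variable.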

Revert the dataset array to the original format.

Reformat dsNew back into its original format.

dsOrig = unstack(dsNew,'Scores','Scores_Indicator');
dsOrig(:,{'LastName','Test1','Test2','Test3'})
ans = 

    LastName            Test1    Test2    Test3
    {'Jeong'   }        90       87       93   
    {'Collins' }        87       85       83   
    {'Torres'  }        86       85       88   
    {'Phillips'}        75       80       72   
    {'Ling'    }        89       86       87   
    {'Ramirez' }        96       92       98   
    {'Lee'     }        78       75       77   
    {'Walker'  }        91       94       92   
    {'Garcia'  }        86       83       85   
    {'Chang'   }        79       76       82    

The dataset array is back in wide format. unstack reassigns the levels of the indicator variable, Scores_Indicator, as variable names in the unstacked dataset array.
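To confirm that stacking followed by unstacking recovers the original scores, the two dataset arrays can be compared directly. This is a minimal sketch on a made-up two-student dataset array (dsDemo); both arrays are sorted by LastName first, so the check does not rely on any assumption about the row order that unstack returns.

```matlab
% Hypothetical two-student dataset array for illustration only.
dsDemo = dataset({{'Jeong';'Collins'},'LastName'}, ...
                 {[90;87],'Test1'},{[87;85],'Test2'},{[93;83],'Test3'});
testVars = {'Test1','Test2','Test3'};

% Round trip: wide -> long -> wide.
dsLong = stack(dsDemo,testVars,'NewDataVarName','Scores');
dsWide = unstack(dsLong,'Scores','Scores_Indicator');

% Sort both arrays by LastName, then compare the numeric score columns.
a = sortrows(dsDemo,'LastName');
b = sortrows(dsWide,'LastName');
roundTripOK = isequal(double(a(:,testVars)),double(b(:,testVars)));
```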