Creating subtable from larger table with criteria

11 views (last 30 days)
I am importing a large excel document (multiple sheets, but focused on first sheet).
I wanted to create a subtable from the large imported table with specific criteria. My goal is to extract columns of data that fit name criteria and create a new table.
I hope to pull columns based on name - i.e. if the column (row 1) contains the words 'insured', 'policy', 'date' take those columns to the new subtable..an issue I am running into is that I want to pull all columns that contain those words and not verbatim.
(If column has 'name of insured' I still want to be able to pull that column - so I can apply program to different word layouts)
My initial thoughts were to use an if loop
  2 Comments
Adam Danz
Adam Danz on 22 Jul 2019
There is no such thing as an 'if loop'. You can use conditionals (if, then, else) but there are better options.
to name a few.
You mentioned that you are working with a table but then you mention, "I hope to pull columns based on name - i.e. if the column (row 1) contains the words..." That makes we wonder if you're really working with tables or rather, cell arrays. Categories such as "insured", "policy", "date" etc seem line table headers in which case they would not be in the first row. Instead they would be headers and the first row would contain data just like any other row.
A table looks like this
Gender Age Smoker
______ ___ ______
Male 38 true
Male 43 false
Female 38 false
Female 40 false
Female 49 false
A cell array with a header row looks like this
6×3 cell array
{'Gender'} {'Age'} {'Smoker'}
{'Male' } {[ 38]} {[ 1]}
{'Male' } {[ 43]} {[ 0]}
{'Female'} {[ 38]} {[ 0]}
{'Female'} {[ 40]} {[ 0]}
{'Female'} {[ 49]} {[ 0]}
The solution depends on which data type you're working with.
GraceH
GraceH on 22 Jul 2019
Oh okay, I am working with a table with headers - and want to extract only certain columns based on headers

Sign in to comment.

Answers (1)

Adam Danz
Adam Danz on 22 Jul 2019
Edited: Adam Danz on 29 Jul 2019
This should get you started (see addendum at the end). The approach is to search for a key word that must be within the header (and no other header). The key word is not case sensitive. The table headers are then renamed so that they all have the same headers based on the list of keys. Extra fields are removed.
% Create two fake tables with varying column names and order
T1 = table({'tom';'jane';'hans'},[188;55;109],floor(now)-[1;2;3],[0;0;0],...
'VariableNames',{'insured','policy','date','trash'});
T2 = T1;
T2.Properties.VariableNames(1:2) = {'NameOfInsured', 'PolicyNumber'};
% List key words to search for (not case sensitive). These will also become the new
% column headers.
keys = {'date','insured','policy'};
% Given a table (T) and key string (k), this function will output the column number
% in T that contains k.
colIdxFcn = @(T,k)find(~cellfun(@isempty,regexpi(T.Properties.VariableNames,k)));
% Loop through all tables and 1) rename columns to match key and 2) resort them so
% the columns are in the same order and 3) remove unneeded columns
T = {T1,T2}; % Put all tables into a cell array
Tout = cell(size(T));
for i = 1:numel(T)
colIdx = zeros(size(keys));
for j = 1:numel(keys)
colIdx(j) = colIdxFcn(T{i},keys{j});
end
Tout{i} = T1(:,colIdx);
Tout{i}.Properties.VariableNames = keys;
end
Tout{n} contains the n_th table with the renamed headers in the order specified by 'keys'.
[addendum]
This is the data cleaning phase. Data cleaning is the least enjoyable and most time consuming part of analysis. If you have more than 1 table header that contains a key word, you'll need to add additional steps to the process. For example, if you have headers in a table named "policy_name" and "policy_date", this solution will not be enough. A complete solution will require looking at individual circumstances that only you can address. It may require loops that search for duplicate matches as in the example I just shared and renaming those headers so they become unique.

Categories

Find more on Data Import from MATLAB in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!