How can I extract certain rows from a CSV imported table?
I have imported a CSV file into MATLAB:
T = readtable('Historia_Rachunku.csv')
Extracted relevant part of it:
variablesByName = T(1:end,["Odbiorca","Data Księgowania","Kwota",])
Now I'd like to extract the transactions belonging to a single company. I tried this:
rowsByName = T(["PAYPRO S.A."],:)
But I get an error. How can I extract transactions based on their companies (e.g. BILETY, PAYPRO, ZABKA) and then put them into separate tables?
Accepted Answer
dpb
on 4 Sep 2022
Unless the table is huge, or you will really never need the other variables, there's no reason not to keep the original as is. Alternatively, you could set up the import options object to read only the wanted variables in the first place. But that's an aside. To select the company of interest you have to do a comparison test, because the company name is data inside a variable, not a variable or row that can be named:
ixPayPro=matches(T.Odbiorca,"PAYPRO S.A."); % the indexing vector
TPayPro=T(ixPayPro,:); % make a specific table of those only
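On the import-options aside: a minimal sketch of reading only the wanted variables, assuming a small made-up CSV written to a temporary file as a stand-in for Historia_Rachunku.csv. The column Opis and all data values are hypothetical, and the variable names in the real file depend on how readtable interprets its header row.

```matlab
% Hypothetical sample file standing in for Historia_Rachunku.csv
fname = [tempname '.csv'];
sample = table(["PAYPRO S.A."; "ZABKA"; "PAYPRO S.A."], [-10.5; -7.2; -30], ...
    ["a"; "b"; "c"], 'VariableNames', {'Odbiorca', 'Kwota', 'Opis'});
writetable(sample, fname);

% Detect the import options, then restrict the import to the wanted variables
opts = detectImportOptions(fname);
opts.SelectedVariableNames = {'Odbiorca', 'Kwota'};
T = readtable(fname, opts);   % Opis is never read in

delete(fname);                % clean up the temporary file
```

With the real file you would pass its name to detectImportOptions and check opts.VariableNames to see what the columns ended up being called.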
BUT... splitting the table up like this really isn't the way to use a table. Instead, treat the variables of interest as grouping variables and operate on the table with them, as described in the overview <grouped-calculations-in-tables>. You've not given a specific analysis of interest here, but actually breaking the table up is rarely the right way to proceed.
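As one possible illustration of the grouping-variable approach, here is a sketch using groupsummary on a small made-up table. The data values are hypothetical, and the sum is just an example statistic since no specific analysis was given.

```matlab
% Hypothetical stand-in for the imported table
T = table(["PAYPRO S.A."; "ZABKA"; "PAYPRO S.A."; "BILETY"], [-10; -5; -30; -20], ...
    'VariableNames', {'Odbiorca', 'Kwota'});

% One row per company: the group count and the total of Kwota,
% computed without splitting T into separate tables
S = groupsummary(T, "Odbiorca", "sum", "Kwota");
disp(S)
```

The result has the variables Odbiorca, GroupCount and sum_Kwota, with one row per company.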
4 Comments
dpb
on 6 Sep 2022
"... wanted to extract transactions from particular companies, then built separate tables out of them and generate comparative plots and histograms ..."
And, yes, that's what the table functions and grouping variables are there for. You don't need to (and shouldn't) build actual separate tables; use grouping variables and the splitapply workflow instead, which can often be compressed into one-liners with rowfun (or retime on timetables), rather than dealing with multiple tables manually.
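A rough sketch of that workflow, assuming a small made-up table (the values are hypothetical) and using the sum and mean as placeholder statistics:

```matlab
% Hypothetical stand-in for the imported table
T = table(["PAYPRO S.A."; "ZABKA"; "PAYPRO S.A."; "BILETY"], [-10; -5; -30; -20], ...
    'VariableNames', {'Odbiorca', 'Kwota'});

% Explicit split-apply: group numbers first, then apply a function per group
[G, names] = findgroups(T.Odbiorca);
totals = splitapply(@sum, T.Kwota, G);    % one total per company, ordered as in names

% The same idea as a one-liner with rowfun and a grouping variable
R = rowfun(@mean, T, 'InputVariables', 'Kwota', ...
    'GroupingVariables', 'Odbiorca', 'OutputVariableNames', 'MeanKwota');
```

Here names lists the companies in the same order as totals, and R is a table with Odbiorca, GroupCount and MeanKwota, so either result can be fed to plotting functions directly.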
More Answers (1)
Walter Roberson
on 4 Sep 2022
For the syntax you tried to work, you would have needed to tell MATLAB that the table T has 'RowNames', with the names given by the Odbiorca column.
However, 'RowNames' must be unique, and it is not clear that your entries are. Judging by the portion of each name that we can see, they are not.
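To illustrate both points, a small sketch with made-up data (the amounts are hypothetical): indexing by name works once unique row names are set, while duplicate row names are rejected when the table is built.

```matlab
% Unique row names: indexing by name works
U = table([-10; -5], 'VariableNames', {'Kwota'}, ...
    'RowNames', {'PAYPRO S.A.', 'ZABKA'});
row = U({'PAYPRO S.A.'}, :);

% Duplicate row names: table() throws an error
try
    table([-10; -30], 'VariableNames', {'Kwota'}, ...
        'RowNames', {'PAYPRO S.A.', 'PAYPRO S.A.'});
    dupOK = true;
catch
    dupOK = false;   % expected: row names must be unique
end
```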
What you can do is
G = findgroups(T.Odbiorca);
grouped_tables = splitapply(@(varargin) {table(varargin{:}, 'VariableNames', T.Properties.VariableNames)}, T, G);
grouped_tables is now a cell array of tables, in which each table only has entries for a single company.
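If you need to know which cell belongs to which company, findgroups can also return the group names. A sketch on a small made-up table (the values are hypothetical), looking up the PAYPRO table by name:

```matlab
% Hypothetical stand-in for the imported table
T = table(["PAYPRO S.A."; "ZABKA"; "PAYPRO S.A."; "BILETY"], [-10; -5; -30; -20], ...
    'VariableNames', {'Odbiorca', 'Kwota'});

% The second output lists the company for each group number
[G, names] = findgroups(T.Odbiorca);
grouped_tables = splitapply(@(varargin) {table(varargin{:}, ...
    'VariableNames', T.Properties.VariableNames)}, T, G);

% Pick out one company's table; strcmp works for string and cellstr names
TPayPro = grouped_tables{strcmp(names, "PAYPRO S.A.")};
```

grouped_tables{k} holds the rows for the company in names(k).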