How to add numbers (unevaluated), calculated by analysing a tall table, to a pre-allocated matrix?
One column of a tall table contains a fixed set of different strings. I'd like to count how many times each string occurs in the column and calculate each string's percentage of the total. I'd like to store these numbers in a pre-allocated matrix and use it to create a bar plot, all without having to gather the full tall column beforehand.
I'm using R2016b.
Here's example code for what I'd like to accomplish:
%Segments is the list of different strings
%Cars is a tall cell column of a table, containing the data / strings
Cars = gather(DB.CarSegment); %This step I'd like to omit
NrCarsPerSeg = zeros(2,size(Segments,2));
%This matrix stores the number of occurrences (row 1) and the percentages (row 2).
for seg = 1:size(Segments,2)
NrCarsPerSeg(1,seg) = sum(strcmp(Cars, Segments{1,seg}));
end
% Percentage:
NrCarsPerSeg(2,:) = (NrCarsPerSeg(1,:) / sum(NrCarsPerSeg(1,:))) * 100;
barPlot = bar(diag(NrCarsPerSeg(2,:)), 'stacked');
So far, the code above only works once Cars has been gathered and is therefore no longer tall. However, as the data grows, gathering might not always be feasible, which is why I'd rather have Cars stay an unevaluated tall array.
-----------------------------------------------
The following I've tried:
- Running the code without gathering gives me this error:
The following error occurred converting from tall to double:
Conversion to double from tall is not possible.
- Declaring NrCarsPerSeg as a tall matrix:
NrCarsPerSeg = tall(zeros(2,size(Segments,2)));
Inside the for loop, this gives me the following error:
For A(m,n,...) = B, m must be either a colon (:) or a tall logical vector.
To circumvent that error I created a tall index vector:
idx = tall(logical([1 0])');
This gives me the following error inside the for loop:
In the assignment A(m,n,...) = B, B must be a scalar value.
The result of 'sum(strcmp(Cars, Segments{1,seg}))' is a 'tall double (unevaluated)', while NrCarsPerSeg(idx,seg) is an "evaluated" 'tall double'. This is probably the crux of the problem. Is there a way to solve this?
Thanks a lot for reading!
Answers (2)
Steven Lord
on 13 Sep 2017
Have you considered trying to make a tall histogram, perhaps combined with a preprocessing categorical call to group the text data in your tall table into categories and make them a categorical array?
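A rough sketch of what that could look like, plus a fallback that stays closer to the original loop. Both rest on the same idea: keep the per-segment counts as unevaluated tall values and gather only that small result, so the full column is never brought into memory. The data below is a made-up stand-in for DB.CarSegment, the variable names carCat, countsTall, parts and countsLoop are hypothetical, and I'm assuming that categorical with a value set and countcats accept tall inputs in your release (please check the list of functions that support tall arrays for R2016b). It uses countcats plus bar rather than histogram so that the pre-allocated matrix from the question is kept.

```matlab
% Hypothetical stand-in data. In the question, Cars would be DB.CarSegment
% left as a tall column (no gather) and Segments the real list of strings.
Segments = {'Compact', 'SUV', 'Van'};
Cars = tall({'SUV'; 'Compact'; 'SUV'; 'Van'; 'SUV'; 'Compact'});

% Approach 1 (assumes tall support for categorical and countcats):
% group the text into categories, fixing the category order to Segments.
carCat = categorical(Cars, Segments);
countsTall = countcats(carCat);   % one count per category, still unevaluated
counts = gather(countsTall);      % gathers only numel(Segments) numbers
counts = counts(:).';             % force a row vector

% Approach 2 (uses only strcmp, sum and gather, as in the question):
% store each unevaluated tall scalar in a cell instead of a numeric matrix.
parts = cell(1, numel(Segments));
for seg = 1:numel(Segments)
    parts{seg} = sum(strcmp(Cars, Segments{seg}));
end
[parts{:}] = gather(parts{:});    % evaluate all counts in a single gather call
countsLoop = [parts{:}];

% Sanity check: both approaches should produce the same counts.
assert(isequal(counts, countsLoop));

% Fill the ordinary (in-memory) matrix from the question and plot it.
NrCarsPerSeg = zeros(2, numel(Segments));
NrCarsPerSeg(1,:) = counts;
NrCarsPerSeg(2,:) = counts / sum(counts) * 100;   % percentages
barPlot = bar(diag(NrCarsPerSeg(2,:)), 'stacked');
```

The errors in the question come from trying to write a tall value into a numeric matrix, or from indexed assignment into a tall matrix. A cell array sidesteps both, because a cell can hold an unevaluated tall scalar as-is, and passing all of them to one gather call lets MATLAB evaluate them together rather than reading the data once per segment.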