How to add elements to specific rows in existing array without using for-loop?

Hello everybody!
I have an existing cell array. For example:
Data =
[0.7482] [0.8258] [0.9619]
[0.4505] [0.5383] [0.0046]
[0.0838] [0.9961] [0.7749]
[0.2290] [0.0782] [0.8173]
[0.9133] [0.4427] [0.8687]
[0.1524] [0.1067] [0.0844]
I would like to add an additional column containing data from another cell array. This additional cell array has two columns. The first one identifies which row the data belongs to, while the second one contains the additional data. There is not necessarily additional data for every row. On the other hand, there might also be two or more values belonging to one row, which should all end up in one additional cell.
Example of additional data:
Additional_Data =
[5] [10]
[1] [20]
[2] [30]
[2] [40]
Expected outcome:
Complete_Data =
[0.7482] [0.8258] [0.9619] [20]
[0.4505] [0.5383] [0.0046] [1x2 double]
[0.0838] [0.9961] [0.7749] []
[0.2290] [0.0782] [0.8173] []
[0.9133] [0.4427] [0.8687] [10]
[0.1524] [0.1067] [0.0844] []
with the [1x2 double] cell being [30 40]
I do not want to use a for-loop. It would also be nice to have an efficient solution, since my original data might have up to 10 000 000 rows, and several data sets will be processed. So run time could be an issue.
Thanks in advance!
--EDIT--
See comment below for clarification.
  2 Comments
Matt J on 17 May 2013
Edited: Matt J on 17 May 2013
I'm skeptical you're going to find an efficient way to do anything with this data storage scheme. Cell arrays are not made for holding 10000000 rows of anything. They store data non-contiguously in RAM, so data access becomes very slow and inefficient when you get up to array sizes like 10000000.
Also, all MATLAB functions that manipulate cell arrays use for-loops, or something equivalent in performance, internally. There are no options for fast, vectorized analysis of cell arrays.
It might be a good idea to explain the purpose of storing the data this way. For example, why not just use the UNIQUE command to extract groups of data?
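To illustrate the UNIQUE suggestion, here is a minimal sketch. It assumes the row numbers and values from the question have already been pulled out of the cell array into plain numeric vectors; the variable names are made up for the example.

```matlab
% Example data from the question, held as numeric column vectors
% (an assumption: the original post keeps them in a cell array).
rowIdx = [5; 1; 2; 2];      % row of Data each entry belongs to
vals   = [10; 20; 30; 40];  % the additional values

% UNIQUE returns the distinct row numbers and, in grp, the group of each entry.
[rowsWithData, ~, grp] = unique(rowIdx);

% Sort by group so that each group's values sit next to each other.
% SORT is stable, so entries within a group keep their input order.
[grpSorted, order] = sort(grp);
valsSorted = vals(order);

% First and last position of each group in the sorted list.
counts   = accumarray(grpSorted, 1);
lastPos  = cumsum(counts);
firstPos = lastPos - counts + 1;

% Look up all values that belong to row 2 of Data.
k = find(rowsWithData == 2);
valuesForRow2 = valsSorted(firstPos(k):lastPos(k)).';
```

Everything stays in contiguous numeric arrays here, so no per-row cell is needed until a particular row is actually requested.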
Timo W on 17 May 2013
Thanks for the answers!
The proposed solutions worked fine for the question I asked, but not for my real problem. Though I tried to get some inspiration from them, I did not find a working solution. I think I need to clarify the question (sorry for not being specific enough in the first place):
-- In my generic example the additional data contained doubles only. My real data contains doubles in the first column but characters in the second one.
It might look like this:
Additional_Data =
[5] 'AA'
[1] 'BB'
[2] 'CC'
[2] 'DD'
So functions which accept doubles only cannot be used.
-- About the concern that cells are not the best way of data storage here: I agree. But the data I have is stored like this. I do not insist on keeping it this way while processing it.
-- What I eventually want to do is merge the two cell arrays and finally store the combined data in a struct array with 4 fields in each row (parameter_1, parameter_2, parameter_3, additional_information).
So in my example I would want to get something like this:
Final_Data =
6x1 struct array with fields:
parameter_1
parameter_2
parameter_3
additional_information
With Final_Data.additional_information containing:
Final_Data(1).additional_information = ['BB']
Final_Data(2).additional_information = ['CC' 'DD']
Final_Data(3).additional_information = []
etc.


Accepted Answer

Matt J on 17 May 2013
Edited: Matt J on 17 May 2013
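This might be a better option for you than adding a 4th column to Data:
j=cell2mat(Additional_Data(:,1)); % row of Data that each entry belongs to
i=(1:length(j)).'; % running number of each entry
s=cell2mat(Additional_Data(:,2)); % the additional values
S=sparse(i,j,s,length(j),size(Data,1)); % one column per row of Data, so rows without entries can be indexed too
Now instead of doing
x = Complete_Data(2,4)
you would do
x=nonzeros(S(:,2))
Note that nonzeros also drops any additional values that are exactly zero.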
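As a worked example of the sparse lookup, here is a small sketch using the data from the question. The explicit size arguments to sparse and the name nRows are additions for illustration, so that rows of Data without any entries can still be queried.

```matlab
% Example data from the question.
Additional_Data = {5, 10; 1, 20; 2, 30; 2, 40};
nRows = 6;  % number of rows in Data (taken from the example)

j = cell2mat(Additional_Data(:,1));  % target row of each entry
i = (1:numel(j)).';                  % running number of each entry
s = cell2mat(Additional_Data(:,2));  % the additional values

% One column per row of Data, one row per entry.
S = sparse(i, j, s, numel(j), nRows);

x2 = nonzeros(S(:,2)).';  % values attached to row 2
x3 = nonzeros(S(:,3));    % row 3 has no entries, so this is empty
```

One caveat of this layout: an additional value of exactly 0 is indistinguishable from "no entry" and would be dropped by nonzeros.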
  3 Comments
Matt J on 18 May 2013
Edited: Matt J on 18 May 2013
Regarding converting to a struct, it would not be a good idea to have one struct element per row. It would be best to combine all of your original numeric data into a single array and put that in a field of a single struct
Final_Data.parameters=cell2mat(Data);
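A short sketch of what the single-struct layout looks like in use, with the first three rows of the example data. The variable names row2 and col3 are only for illustration.

```matlab
% First three rows of Data from the question, as a cell array.
Data = {0.7482 0.8258 0.9619
        0.4505 0.5383 0.0046
        0.0838 0.9961 0.7749};

% One struct, with all numeric parameters in a single contiguous matrix.
Final_Data = struct();
Final_Data.parameters = cell2mat(Data);

% Rows and columns are then reached by ordinary matrix indexing.
row2 = Final_Data.parameters(2, :);  % the three parameters of row 2
col3 = Final_Data.parameters(:, 3);  % parameter 3 for every row
```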
As for dealing with the case where Additional_Data contains strings, here is a generalization,
j=cell2mat(Additional_Data(:,1));
i=1:length(j);
s=Additional_Data(:,2);
S=sparse(i,j,i);
map=find(any(S,1)); % rows of Data that have additional data
Final_Data.additional_information=cell(size(Data,1),1); % one cell per row of Data, not per row of Additional_Data
Final_Data.additional_information(map)=arrayfun(@(i) s(nonzeros(S(:,i))).',map,'uni',0);
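The same grouping can also be sketched with accumarray instead of sparse. This is a swapped-in alternative, not the method from the comment above; nRows and the other variable names are assumptions for the example. accumarray only accepts numeric data, so the entry positions are grouped and the strings are looked up through them.

```matlab
% Example data from the clarified question.
Additional_Data = {5, 'AA'; 1, 'BB'; 2, 'CC'; 2, 'DD'};
nRows = 6;  % number of rows in Data (taken from the example)

rowIdx = cell2mat(Additional_Data(:,1));  % target row of each entry
labels = Additional_Data(:,2);            % the strings, still in a cell

% Sort by target row first: with sorted subscripts accumarray keeps
% the entries of each group in their original order.
[rowSorted, order] = sort(rowIdx);

% Group the entry positions per row and fetch the matching strings.
% Rows without any entry end up as empty cells.
grouped = accumarray(rowSorted, order, [nRows 1], @(k) {labels(k).'});
```

Here grouped{2} holds the 1x2 cell with 'CC' and 'DD', and grouped{3} is empty.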
Timo W on 20 May 2013
Thank you, your input helped a lot!
I finally went for this solution:
Final_Data = Data;
j = cell2mat(Additional_Data(:,1));
i = 1:length(j);
s = Additional_Data(:,2);
S = sparse(i,j,i);
map = find(any(S,1));
Final_Data(map,end+1) = arrayfun(@(idx) s(nonzeros(S(:,idx))).', map, 'UniformOutput', false)
param_names = {'parameter_1' 'parameter_2' 'parameter_3' 'additional_information'};
Data_Struct = cell2struct(Final_Data, param_names , 2);
Also, thanks for pointing out that this might be an inadequate way of storing this data. I'm not used to working with such large amounts of data and did not think about it.
But I think this will not be an issue for my code, since I will not actually work much with the data once I have stored it. I only need to draw a few data sets from the struct. So I kept my structure, since I find it to be a more intuitive storage scheme for my specific data.
Anyway I will keep this issue in mind for the future.


More Answers (1)

Azzi Abdelmalek on 17 May 2013
Edited: Azzi Abdelmalek on 17 May 2013
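D=num2cell(D); % assumes D starts out as the numeric matrix behind Data
A_D=num2cell(A_D); % assumes A_D starts out as the numeric matrix behind Additional_Data
r=cell2mat(A_D(:,1)); % compare against the row numbers only, not against both columns
N_D=arrayfun(@(x) A_D(r==x,2),(1:size(D,1))','un',0)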
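To arrive at Complete_Data from the question, the new column still has to be appended to the data. A self-contained sketch, assuming that the row numbers are compared against the first column of A_D only and that each group is flattened into a numeric row vector:

```matlab
% Example data from the question, converted to cell arrays.
D   = num2cell([0.7482 0.8258 0.9619
                0.4505 0.5383 0.0046
                0.0838 0.9961 0.7749
                0.2290 0.0782 0.8173
                0.9133 0.4427 0.8687
                0.1524 0.1067 0.0844]);
A_D = num2cell([5 10; 1 20; 2 30; 2 40]);

rowIdx = cell2mat(A_D(:,1));  % target row of each additional value

% For every row of D, collect the matching values as a numeric row vector.
% Rows without a match yield an empty array.
N_D = arrayfun(@(x) cell2mat(A_D(rowIdx == x, 2)).', ...
               (1:size(D,1)).', 'UniformOutput', false);

% Append the collected values as a fourth column.
Complete_Data = [D N_D];
```

The comparison rowIdx == x is repeated for every row of D, so the cost grows with the product of both row counts. That is fine for small inputs but likely too slow for the 10 000 000 rows mentioned in the question.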
