# Split and group data in Array

Luke on 3 Jan 2019
Commented: Cris LaPierre on 4 Jan 2019
Hello,
I have an array of size 2200056×1, plotted below.
I want to split the array according to its different steps.
Goal: find the indices shown in figure 2 based on the input signal array (figure 1).
In this case, for an input with multiple steps (figure 1), the output should be index numbers 1, 2, 3, 4, 7, 9, 10, 11.
If the array consists of only one step signal, I can find the index easily by taking median(array). But if the array contains multiple sequences, how can this be achieved? I think it would be possible if I could split the data into groups, but I am not sure how to split it.
Cris LaPierre on 3 Jan 2019
Edited: Cris LaPierre on 3 Jan 2019
Here's one way if you want the values to be rounded to represent what is in your input speed array.
% Round values to remove perturbations
p10 = floor(log10(speed));
% Set resolution
p10(p10<1)=1; % smallest increment is 10
p10(p10>3)=3; % largest increment is 1000
tmp = speed./10.^p10;
newSpeed = round(tmp).*10.^p10;
% find and remove transitions
idx = find(diff(newSpeed)>0);
idx([inf; diff(idx)]>5000) = [];
newSpeed(ismember(newSpeed,newSpeed(idx)))=[];
% remove tail
idx = find(diff(newSpeed)<0);
newSpeed(idx(1):end)=[];
steps = unique(newSpeed)
steps = 8×1
0
100
600
1000
4000
8000
10000
13000
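To make the rounding step above easier to follow, here is a small sketch that runs the same three operations on a handful of made-up samples. The sample values are assumptions chosen for illustration and are not taken from the poster's data; the variable names follow the code above.

```matlab
% Hypothetical noisy samples (illustrative values, not the original data)
speed = [0; 98; 612; 4030; 12950];

% Order of magnitude of each sample; log10(0) is -Inf, handled by the clamp
p10 = floor(log10(speed));
p10(p10<1) = 1;   % smallest increment is 10
p10(p10>3) = 3;   % largest increment is 1000

% Scale down, round, and scale back up
tmp = speed./10.^p10;            % [0; 9.8; 6.12; 4.03; 12.95]
newSpeed = round(tmp).*10.^p10;  % [0; 100; 600; 4000; 13000]
```

Each sample is rounded with a resolution that depends on its magnitude, so small perturbations around a step collapse onto the step value itself.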

Cris LaPierre on 3 Jan 2019
Edited: Cris LaPierre on 3 Jan 2019
Use the findgroups and splitapply functions.
Here is a simple example provided in the documentation:
% Load table containing info for 100 patients
load patients
% Specify groups by gender with findgroups.
G = findgroups(Gender);
% Split Height into groups specified by G.
% Calculate the mean height by gender.
splitapply(@mean,Height,G)
ans = 2×1
65.1509
69.2340
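Applied to a speed signal, the same two functions might be used roughly as follows. This is only a sketch under stated assumptions: the `speed` values are a tiny synthetic stand-in for the real array, rounding to the nearest 100 is assumed to be an acceptable resolution, and `minCount` is a hypothetical threshold for discarding values that occur only briefly.

```matlab
% Synthetic stand-in for the speed array (assumption, not the original data)
speed = [0; 0.4; 0.2; 0; 0.1; ...        % step at 0
         101; 98; 100.5; 100; 101; ...   % step at 100
         340; ...                        % single transition sample
         603; 597; 601; 600; 602];       % step at 600

% Round to the nearest 100 so that noisy samples share one value per step
roundedSpeed = round(speed,-2);

% Assign a group number to each distinct rounded value
[G,levels] = findgroups(roundedSpeed);

% Count the samples in each group
counts = splitapply(@numel,roundedSpeed,G);

% Keep only values that persist for at least minCount samples
minCount = 3;   % hypothetical threshold, tune for the real signal
steps = levels(counts >= minCount);   % [0; 100; 600]
```

The count threshold is what separates a real step from a value that only appears while the signal moves between steps.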
Cris LaPierre on 3 Jan 2019
Edited: Cris LaPierre on 3 Jan 2019
This approach will need some modification if your data is not just the step values (e.g. if it also contains values corresponding to the transition between each step). What to do in that scenario will depend on the actual data.
Luke on 4 Jan 2019
I have come up with a different logic, shown below, which gives the same result.
% Round each sample to the nearest 100 to remove perturbations
speed_filt = round(speed(:),-2);
% Count how often each rounded value occurs
[n,bin] = hist(speed_filt,unique(speed_filt));
if isempty(n)
    inStruct.ind = 0;
else
    [~,idx] = sort(-n); % sort counts in descending order
    temp1 = n(idx)'; % count instances
    temp2 = bin(idx); % corresponding values
    % Keep only values that occur more than 2000 times (the steps)
    for i=1:length(temp1)
        if temp1(i)> 2000
            inStruct.ind(i) = temp2(i);
        end
    end
end
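The same counting idea can also be written without hist and without loops. The sketch below uses unique and accumarray instead; it is an alternative formulation, not the code posted above. The `speed` values are synthetic, `minCount` is a hypothetical stand-in for the 2000-sample threshold, and the result is ordered by value rather than by count.

```matlab
% Synthetic stand-in for the speed array (assumption, not the original data)
speed = [0; 0; 0; 0; 102; 99; 101; 100; 240; 598; 601; 600; 602];

% Round to the nearest 100, forcing a column vector
speedFilt = round(speed(:),-2);

% Distinct values and, for each sample, the index of its value in levels
[levels,~,groupIdx] = unique(speedFilt);

% Number of samples per distinct value
counts = accumarray(groupIdx,1);

% Keep values that occur more than minCount times
minCount = 2;   % hypothetical; the code above uses 2000 on the full array
ind = levels(counts > minCount);   % [0; 100; 600]
```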
Cris LaPierre on 4 Jan 2019
Nice!