## How to find mean values?

### Thar (view profile)

on 17 Jan 2015
Latest activity Commented on by Star Strider

on 18 Jan 2015


I have two columns with data:
2,40710000000000e+18 0,000555555555555642
2,40590000000000e+18 0,000555555555555642
2,05070000000000e+18 0,000555555555555642
2,41970000000000e+18 0,555555555555642
2,48580000000000e+18 0,000555555555555642
2,51880000000000e+18 0,000277777777777821
2,74900000000000e+18 0,216666666666667
3,82680000000000e+18 0,000555555555555642
3,05860000000000e+18 0,000555555555555642
3,72440000000000e+18 0,555555555555642
I want to compute mean values of the 1st column, grouped according to the data in the second column. Most values in the second column are <0.01, but some are >0.01. I want the mean of the 1st column from the first row up to the row where the second column is >0.01, without that row, and then the same for each following group. For example, rows 1:3 of the first column, then 4:6, 7:9 and so on (the group length is not constant).
I have the script :
for k=1:4828
    if G(k,2)<=0.01 ;
        idx=G(k,2)<=0.01;
        N=mean(G(idx,1));
    end
end
This script doesn't work correctly. Any ideas?
Thank you!


### Star Strider (view profile)

on 17 Jan 2015
‘from the first column 1:3 lines, then 4:6, 7:9 and so on’
You intended to say: ‘from the first column 1:3 lines, then 5:6, 8:9 and so on’, omitting all rows with values in the second column >0.01.
Correct?

### Star Strider (view profile)

on 17 Jan 2015

This works:
M = [ 2.40710000000000e+18 0.000555555555555642
2.40590000000000e+18 0.000555555555555642
2.05070000000000e+18 0.000555555555555642
2.41970000000000e+18 0.555555555555642
2.48580000000000e+18 0.000555555555555642
2.51880000000000e+18 0.000277777777777821
2.74900000000000e+18 0.216666666666667
3.82680000000000e+18 0.000555555555555642
3.05860000000000e+18 0.000555555555555642
3.72440000000000e+18 0.555555555555642];
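Idx = find(M(:,2) > 0.01);            % rows where column 2 exceeds the threshold
rows = diff([0; Idx]);                % number of rows in each segment
Mc = mat2cell(M, rows, 2);            % one cell per segment, with the >0.01 row last
Colmean1 = zeros(1, size(Mc,1));      % preallocate the result
for k1 = 1:size(Mc,1)
    Colmean1(k1) = mean(Mc{k1}(1:end-1,1));   % leave out the last (>0.01) row
end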
It works by finding the elements >0.01 in ‘Idx’, then using that information to create a cell array from your matrix, using the mat2cell function, then calculating the column 1 means (in ‘Colmean1’) by taking the means of each cell. (I copied the matrix here because in my location, the dot (.) is the decimal separator.)
I checked the means manually, and they appear to be the values you want.
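As a quick check of the intermediate values, the sketch below reruns the same steps on the sample matrix and prints what each one produces. It swaps the loop for `cellfun`, which is only a stylistic assumption here and not part of the answer above; the first-column values are written with their trailing zeros dropped.

```matlab
% Sample matrix from the question (trailing zeros dropped)
M = [2.4071e18 0.000555555555555642
     2.4059e18 0.000555555555555642
     2.0507e18 0.000555555555555642
     2.4197e18 0.555555555555642
     2.4858e18 0.000555555555555642
     2.5188e18 0.000277777777777821
     2.7490e18 0.216666666666667
     3.8268e18 0.000555555555555642
     3.0586e18 0.000555555555555642
     3.7244e18 0.555555555555642];

Idx  = find(M(:,2) > 0.01);     % break rows: 4, 7, 10
rows = diff([0; Idx]);          % segment lengths: 4, 3, 3
Mc   = mat2cell(M, rows, 2);    % 3x1 cell array, break row last in each cell

% Mean of column 1 in each cell, leaving out the final (break) row
Colmean1 = cellfun(@(c) mean(c(1:end-1,1)), Mc).';

fprintf('%.4e\n', Colmean1);    % 2.2879e+18, 2.5023e+18, 3.4427e+18
```

These are the means of rows 1:3, 5:6 and 8:9 of the first column.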


### Star Strider (view profile)

on 18 Jan 2015
My pleasure!
I would have to see your matrix to address that error specifically. (My code works without error in the segment you posted.)
As an interim fix, change the ‘Idx’ assignment to this:
Idx = [find(M(:,2) > 0.01); size(M,1)];
If you’re still having problems after that change, use the save command to save your matrix as a .mat file, attach it to your next comment using the ‘paperclip’ icon, and I’ll see what I can do.

### Thar (view profile)

on 18 Jan 2015
Thank you!
That was the error!!

### Star Strider (view profile)

on 18 Jan 2015
My pleasure!
If that fixed the problem, then the last cell mean is not being computed correctly. Change the loop to:
for k1 = 1:size(Mc,1)
    Colmean1(k1) = mean(Mc{k1}(1:end-1,1));
    if k1 == size(Mc,1)
        Colmean1(k1) = mean(Mc{k1}(1:end,1));
    end
end
and all the calculations will be correct. (The change reflects the fact that the column 2 value in the very last row is <=0.01, so that row's column 1 value should be included in the last mean.)
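One possible way to avoid special-casing the last cell is sketched below. It is an assumption about what the full matrix looks like, not something tested on it: the final segment is closed only when the matrix does not already end on a >0.01 row, and the rows to average are picked by the column 2 test rather than by position. The sample uses only the first nine rows of the posted data, so that it ends on a row that is <=0.01.

```matlab
% First nine rows of the posted sample (ends on a row with column 2 <= 0.01)
M = [2.4071e18 0.000555555555555642
     2.4059e18 0.000555555555555642
     2.0507e18 0.000555555555555642
     2.4197e18 0.555555555555642
     2.4858e18 0.000555555555555642
     2.5188e18 0.000277777777777821
     2.7490e18 0.216666666666667
     3.8268e18 0.000555555555555642
     3.0586e18 0.000555555555555642];

Idx = find(M(:,2) > 0.01);                 % break rows: 4, 7

% Close the final segment only if the last row is not itself a break row
if isempty(Idx) || Idx(end) ~= size(M,1)
    Idx = [Idx; size(M,1)];                % now 4, 7, 9
end

Mc = mat2cell(M, diff([0; Idx]), 2);       % one cell per segment

Colmean1 = zeros(1, numel(Mc));
for k1 = 1:numel(Mc)
    seg = Mc{k1};
    % Average only the rows that pass the threshold test, so it does not
    % matter whether the segment ends with a break row or not
    Colmean1(k1) = mean(seg(seg(:,2) <= 0.01, 1));
end

fprintf('%.4e\n', Colmean1);               % 2.2879e+18, 2.5023e+18, 3.4427e+18
```

If two >0.01 rows were adjacent, the cell between them would contain no rows that pass the test and its mean would come out as NaN.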

### Geoff Hayes (view profile)

on 17 Jan 2015

Thodoris - your condition is to check whether the kth element of the second column is less than 0.01, and if so, then your idx variable is set to one as
idx=G(k,2)<=0.01;
idx will always be of the logical data type, 0 or 1, since it is the result of a comparison of two numbers. Your calculation for the mean is then just that for G(1,1) and so is always the same. If you want to use your above code, then you probably should be looking for those elements in the second column of G that are larger than 0.01. When you find one, then you know to compute the average of the previous set of elements i.e. those from the last occurrence of a value greater than 0.01 to one less than the current index (which corresponds to a value that is greater than 0.01). Something like the following
% create an array to store all averages
averages = [];
% create an index that will be used to grab sets of data
% from the last value greater than 0.01 to the current index
% less one
idx = 1;
% iterate over each row of G
for k=1:size(G,1)
    if G(k,2) > 0.01
        % average the last set of elements from idx to the current index k
        % and save to our array
        averages = [averages ; mean(G(idx:k-1,1))];
        % remember where the next set to be averaged starts
        idx = k;
    end
end
% calculate the last average
averages = [averages ; mean(G(idx:end,1))];
An alternative to the above is to use find to locate the rows whose second-column value is larger than 0.01, for example:
% get indices of those elements from the second column that are greater than 0.01
idcs = find(G(:,2)>0.01);
% determine the start indices for each set
setStartIdcs = [1 ; idcs];
% determine the end indices for each set
setEndIdcs = [idcs - 1 ; size(G,1)];
% calculate the average
averages = arrayfun(@(x)mean(G(setStartIdcs(x):setEndIdcs(x),1)),1:length(setStartIdcs));
Once we have found the indices of those elements greater than 0.01, we can then determine the start and end indices for each of the sets to average. The start indices are just those from idcs with a one prepended to it (like how we initialized idx to be one in the first example), and the end indices are just idcs less one from each index (like the k-1 we used previously) with the last index (the number of rows of G) appended to the end of it (like our last statement outside the above for loop). We then use arrayfun to calculate the average of each set...and we are done!
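To see the point about the logical idx concretely, here is a small sketch with made-up numbers (the 3-by-2 matrix is hypothetical, not the poster's data). It shows that a scalar logical index picks out row 1 regardless of k, whereas comparing the whole column gives one flag per row.

```matlab
% Hypothetical 3x2 matrix, for illustration only
G = [10 0.001
     20 0.500
     30 0.002];

k   = 3;                   % any row whose second column is <= 0.01
idx = G(k,2) <= 0.01;      % scalar comparison, so idx is a logical scalar (true)

disp(class(idx));          % logical
disp(G(idx,1));            % 10: a scalar true selects row 1, not row k
disp(mean(G(idx,1)));      % 10

% Comparing the whole column instead gives one flag per row
mask = G(:,2) <= 0.01;     % [true; false; true]
disp(mean(G(mask,1)));     % 20, the mean of 10 and 30
```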

on 17 Jan 2015
Edited by dpb

### dpb (view profile)

on 17 Jan 2015

This presumes the rest of the data follows the example above, with only a single demarcation row between sections; if several such rows can be adjacent, you'll have to cull them out. I can't tell precisely from the description whether the row with the greater value is or is not to be included: the first set (1:3) excludes it, yet the second (4:6) includes it. I assumed the 4:6 should have been 5:6, owing to the phrase "...without this line" in the last condition given.
idx=find(G(:,2)>0.01); % the breakpoints
M=zeros(1,length(idx)); % allocate space for the means
i1=1;
for i=1:length(idx)
  i2=idx(i)-1;            % last row before the i-th breakpoint
  M(i)=mean(G(i1:i2,1));
  i1=i2+2;                % skip the breakpoint row itself
end
If i1<=size(G,1) after the loop finishes, then you've got one more group to add from there to the end. You can test for this before the loop: there is no trailing group when
idx(end)==size(G,1)
I'll leave these details as "exercise for the student"... :)
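For anyone who wants to compare notes on that exercise, one possible vectorized sketch uses `cumsum` to label the groups and `accumarray` to average them. This is a different technique from the loop above, and the small matrix is hypothetical: it was made up to contain two adjacent rows >0.01 and a trailing group. The 0.01 threshold and the "exclude the >0.01 row" reading are the assumptions discussed above.

```matlab
% Hypothetical data: rows 3 and 4 are adjacent breakpoints, rows 5:6 trail
G = [1 0.001
     3 0.001
     9 0.500
     9 0.700
     5 0.002
     7 0.003];

isBreak = G(:,2) > 0.01;         % true on breakpoint rows
grp     = cumsum(isBreak) + 1;   % group label; increases at every breakpoint
keep    = ~isBreak;              % rows that take part in the means

% Mean of column 1 per group; groups with no kept rows are filled with NaN
M = accumarray(grp(keep), G(keep,1), [], @mean, NaN);

M = M(~isnan(M));                % cull the empty groups between adjacent breakpoints

disp(M.');                       % 2 6
```

The trailing group needs no special handling here because every row that is <=0.01 carries a group label, whether or not a breakpoint follows it.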