3D array, remove columns where second array == -1

Hi,
I am currently processing 13 channels of PSG data. I take my data and create 10 second buffers with 5 second overlaps for the length of the data. This returns a 3D array "windows3D" which is 2000*5147*13 (buffer width * no. of buffers * channels).
I have a second array called "truthBuffer" which is 1*5147. This array tells me whether the 2000 samples in each buffer have been identified by a clinician as positive (1), negative (0), or not scored (-1). I wish to perform machine learning on this, so I want to remove the elements of truthBuffer which are -1 and then remove the corresponding columns from all channels of windows3D.
This should return scoredTruthBuffer as a 1*2538 array (as there are 2609 '-1' values; this part works) and assessedWindows3D as a 2000*2538*13 array (where the same 2538 columns are retained for all 13 channels); this part doesn't work.
This is what I've tried below. It works well for "scoredTruthBuffer" (1*5147 ==> 1*2538), but "assessedWindows3D" changes from 2000*5147*13 to a 1*133819391 array, which is not what I expect. I am expecting an array of 2000*2538*13 back.
scoredTruthBuffer = truthBuffer; % Create a mirror of truthBuffer || creates sized array 1*5147 as expected
assessedWindows3D = windows3D; % Create a mirror of windows3D || creates sized array 2000*5147*13 as expected
toRemove = truthBuffer == -1; % Find what elements to remove in truthBuffer || 2609 '-1' values (5147-2609 = 2538)
scoredTruthBuffer(toRemove) = []; % remove elements from scoredTruthBuffer || creates sized array 1*2538 as expected
assessedWindows3D(toRemove) = []; % remove elements from assessedWindows3D || creates sized array 1*133819391 UNEXPECTED, expecting 2000*2538*13
I have also tried to include a for loop to perform this task channel by channel, but I get the following error:
% Pseudocode
% for i = 1:13
% for j = 1:length(truthBuffer)
% if truthBuffer == -1
% windows3D(:, j ,i) = [];
% else
%
% end
% end
% end
% Error given
=>> A null assignment can have only one non-colon index.
I know the error is generated by the inclusion of both 'j' and 'i' but I cannot come up with an easy way around this.
How can I return "assessedWindows3D" as a 2000*2538*13 to match the columns removed in scoredTruthBuffer?
Thanks in advance,
Christopher

Accepted Answer

Dave B on 10 Nov 2021
Edited: Dave B on 10 Nov 2021
From what I can see here, you've got a logical vector called toRemove, and you want to remove columns where toRemove is true from your matrix. But you didn't specify that it was columns which you wanted removed when you tried to remove from assessedWindows3D.
Here's a simplified example:
a = [1 -1 2 -1];
toRemove = a == -1;
assessedWindows3D = rand(2, 4, 2); % smaller version of assessedWindows3D so we can see it
assessedWindows3D
assessedWindows3D =
assessedWindows3D(:,:,1) =
    0.4593    0.8598    0.1560    0.5102
    0.8747    0.8497    0.6789    0.1773
assessedWindows3D(:,:,2) =
    0.8688    0.0926    0.0603    0.0062
    0.2144    0.5245    0.8124    0.3713
assessedWindows3D(:,toRemove,:) = []; % note this is indexing toRemove in the second dimension
assessedWindows3D
assessedWindows3D =
assessedWindows3D(:,:,1) =
    0.4593    0.1560
    0.8747    0.6789
assessedWindows3D(:,:,2) =
    0.8688    0.0603
    0.2144    0.8124
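Applied to the shapes in the question, the same second-dimension indexing might look like the sketch below. It uses small stand-in sizes (4 samples, 6 buffers, 3 channels) instead of the real 2000*5147*13 data, and `keep` is a hypothetical name introduced here. Selecting the columns to keep is an alternative to deleting the ones marked -1; both should give the same result.

```matlab
% Stand-in data: 4 samples x 6 buffers x 3 channels (illustrative sizes only)
windows3D   = reshape(1:4*6*3, 4, 6, 3);
truthBuffer = [1 -1 0 -1 1 0];

keep = truthBuffer ~= -1;                    % hypothetical name: true for scored buffers

scoredTruthBuffer = truthBuffer(keep);       % 1 x 4
assessedWindows3D = windows3D(:, keep, :);   % 4 x 4 x 3, same columns kept in every channel

disp(size(scoredTruthBuffer))
disp(size(assessedWindows3D))
```

Because the mask sits in the second index position and the other two positions are colons, every row and every channel is carried along with each retained column.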
  3 Comments
Dave B on 11 Nov 2021
Yes, and it's both neat and important! Here's a longish explanation about what's going on, because I think it's an important and interesting topic. And some experiments to help explain because it's always easier to understand this stuff with demos!
Note: for a more careful and edited explanation, there's a blog post by Steve Eddins and Loren Shure here.
Think of a matrix in MATLAB as a list of numbers with a shape. You can address the numbers based on row and column but also based on position in the list. The values in the list are arranged in column-major order which means that the items in the same column have adjacent indices in the list:
x = [2 3 6; 7 12 13]
x = 2×3
     2     3     6
     7    12    13
x(2,1) % second row first column
ans = 7
x(2) % second item
ans = 7
x(1,2) % first row second column
ans = 3
x(3) % third item
ans = 3
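If you want to convert between the two kinds of address explicitly, the built-in functions sub2ind and ind2sub do that. A small sketch using the same x as above:

```matlab
x = [2 3 6; 7 12 13];

idx = sub2ind(size(x), 1, 2);    % linear index of row 1, column 2
[r, c] = ind2sub(size(x), 4);    % row and column of the fourth item in the list

disp(idx)       % 3
disp([r c])     % 2 2
disp(x(4))      % 12
```

The fourth item is 12 because the list runs down the first column (2, 7), then down the second (3, 12), and so on.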
This kind of linear indexing is dead useful. For example, what if we wanted to change all the non-prime numbers in x to NaN?
x(~isprime(x))=nan
x = 2×3
     2     3   NaN
     7   NaN    13
Consider how you'd do this by looking at rows and columns: You could say x(2,2)=nan and x(1,3)=nan, but you can't do both at once: x([1 2], [2 3]) = nan would have replaced the 3 and 13 also. When you write x(~isprime(x)) you're referencing the values based on their position in the list rather than their row and column. This works because while ~isprime(x) is a matrix, x(~isprime(x)) is a vector:
x = [2 3 6; 7 12 13];
x(~isprime(x))
ans = 2×1
    12
     6
Here's another place that might be useful:
data = randn(20,100);
mean(data(data>0)) % note that data(data>0) isn't a matrix
ans = 0.7966
And of course it will help you think about what reshape will do if you find yourself using it:
y=1:8
y = 1×8
1 2 3 4 5 6 7 8
reshape(y,2,4)
ans = 2×4
     1     3     5     7
     2     4     6     8
reshape(y,4,2)
ans = 4×2
     1     5
     2     6
     3     7
     4     8
reshape(y,2,2,2)
ans =
ans(:,:,1) =
     1     3
     2     4
ans(:,:,2) =
     5     7
     6     8
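One way to convince yourself that reshape only changes the shape and never the order of the list is to read the reshaped array back out with (:). This is a small sketch; the names r, flat and k are just for illustration:

```matlab
y = 1:8;
r = reshape(y, 2, 2, 2);

flat = r(:).';                     % read r back out in linear (column-major) order
disp(isequal(flat, y))             % 1: reshape did not reorder anything

k = sub2ind(size(r), 2, 1, 2);     % linear index of row 2, column 1, page 2
disp(k)                            % 6
disp(r(2, 1, 2))                   % 6
```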
One important and powerful note: the order in the list is not just how you interact with the values but also how they're stored in memory. So two values that are on adjacent rows in the same column are next to each other in memory, but two values that are in adjacent columns in the same row are potentially very distant. This can have dramatic implications for performance.
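If you want to see the effect on your own machine, a rough sketch like the following compares pulling one column against pulling one row out of a large matrix. The matrix size is arbitrary, and the timings will depend on your hardware and MATLAB release, so treat any difference you see as indicative only:

```matlab
A = rand(4000);                           % 4000 x 4000 matrix, size chosen arbitrarily

tCol = timeit(@() sum(A(:, 2000)));       % one column: 4000 contiguous values in memory
tRow = timeit(@() sum(A(2000, :)));       % one row: consecutive values are 4000 elements apart

fprintf('column: %.3g s, row: %.3g s\n', tCol, tRow);
```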
Alright, so now back to that 1 x 133819391 array. When you did assessedWindows3D(toRemove) = []; MATLAB saw that you wanted to remove the true values of toRemove, but treated this as a linear index (i.e. remove them from the list of values in assessedWindows3D). It removed the values and (because the result was not necessarily a matrix any more) returned a vector. You tried to remove 2609 columns but instead removed 2609 values:
2000*5147*13 - 133819391
ans = 2609
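The same contrast can be reproduced at a size small enough to inspect. In this sketch w stands in for windows3D and toRemove for truthBuffer == -1; the sizes are made up for illustration:

```matlab
w = rand(4, 6, 3);                     % stand-in for windows3D
toRemove = logical([0 1 0 1 0 0]);     % stand-in for truthBuffer == -1

a = w;
a(toRemove) = [];                      % linear indexing: deletes items 2 and 4 of the list
disp(size(a))                          % 1 70

b = w;
b(:, toRemove, :) = [];                % second-dimension indexing: deletes columns 2 and 4
disp(size(b))                          % 4 4 3

disp(numel(w) - numel(a))              % 2, the number of true values in toRemove
```

The first form loses only as many values as there are true entries in the mask and flattens the array; the second loses that many whole columns across every channel and keeps the 3D shape.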
Here's one more set of demos to tie it all together:
z=[1 2;3 4]
z = 2×2
     1     2
     3     4
z(3)=[] % remove the third value of z, which is row 1 col 2: 2
z = 1×3
1 3 4
z=[1 2;3 4]
z = 2×2
     1     2
     3     4
z([true false false true])=[] % remove the first and last values of z, 1 and 4
z = 1×2
3 2
z=[1 2;3 4]
z = 2×2
     1     2
     3     4
z([true false])=[] % remove the first value of z, note MATLAB will fill in false when your index isn't long enough!
z = 1×3
3 2 4
Hope that makes some sense!
Christopher McCausland on 11 Nov 2021
Hi Dave,
Thank you for taking the time to answer that! It makes so much more sense now! Really good examples too!
Ironically, I have used linear indexing before, but hadn't considered that's what MATLAB was doing. With your explanation this makes sense though!
Thank you so much!
Christopher

Release

R2020b
