How do I obtain regression coefficients from a large data set?

I'm looking to obtain regression coefficients from three predictor variables (e.g. alpha1rad,alpha2rad, alpha3rad), where each variable is [101 x 20] (i.e. 101 data frames and 20 trials).
In MATLAB:
data =[alpha1rad(i,:);alpha2rad(i,:);alpha3rad(i,:)];
mn = mean(data,2);
dev = data-repmat(mn,1,N);
For one point in time my data was a [1 x 20] (i.e. one data frame for 20 trials) where each predictor variable (x) was:
x1 = dev (1,:); x2 = dev (2,:); x3 = dev (3,:);
before defining X as:
X = [ones(length(x1),1) x1' x2' x3'];
How can I define X using the larger data sets (i.e. [alpha1rad(i,:);alpha2rad(i,:);alpha3rad(i,:)]), and would this require a for loop such as:
for i=1:101
data = [alpha1rad(i,:);alpha2rad(i,:);alpha3rad(i,:)];
end

3 Comments

Can you explain more about what you need? Here you appear to be getting a 20-by-4 matrix X using one row from three 20-column arrays (and defining a column of ones). When you say you want to use 101-by-20 arrays, what size do you want X to be?
Thanks for responding.
Like you say, the X for one point in time gave me a 20-by-4 matrix where each variable consisted of a [1 x 20] array. However, I'm now looking to obtain coefficients for a whole movement trial, where each variable (e.g. x1) is a [101 x 20] array, so I'm not sure what the size of X will be or how I could obtain the regression coefficients from this larger data set. The outcome variables (x, y, z) are also [101 x 20] arrays, so I was planning on using the following for loop:
for i=1:101
data = [x1(i,:);x2(i,:);x3(i,:)];
Y = [x(i,:);y(i,:);z(i,:)];
end
to apply the following code to each row of data as a function:
[M,N] = size(data);
mn = mean(data,2);
dev = data-repmat(mn,1,N);
x1 = dev (1,:);
x2 = dev (2,:);
x3 = dev (3,:);
before creating X using:
X = [ones(length(x1),1) x1' x2' x3'];
Just to add to my last message, I'm using the following code to obtain the coefficients for 4 predictor variables and two outcome variables:
function [VUCM,VUCMp,J]=regsoccer2(data,Y,d)
% Predictor variables
[M,N] = size(data);
mn = mean(data,2);
dev = data-repmat(mn,1,N);
x1 = dev (1,:);
x2 = dev (2,:);
x3 = dev (3,:);
x4 = dev (4,:);
% Y output variables
mnY = mean(Y,2);
devY = Y-repmat(mnY,1,N);
X = [ones(length(x1),1) x1' x2' x3' x4'];
B = X\devY';
J = B';
Z = null(J);
I'm also using the following umbrella code:
for i=1:101;
data = [x1(i,:);x2(i,:);x3(i,:);x4(i,:)];
Y = [x(i,:);y(i,:)];
[perp,para,J]=regsoccer1(data,Y,d);
VUCM(i)=[perp];
VUCMp(i)=[para];
end
Thanks


 Accepted Answer

Perhaps you can build on this. Here I set up some fake data with a known relationship with a single outcome variable. Then I loop over all rows and compute the coefficients, and assemble them into a coefficient matrix. I look at the first few to make sure they capture the known relationship.
>> x1 = rand(101,20);
>> x2 = rand(101,20);
>> x3 = rand(101,20);
>> trial = (1:101)';
>> y = repmat(trial,1,20) + x1 + 2*x2 + 3*x3 + randn(101,20)/10;
>> b = zeros(4,101);
>> for j=1:101
X = [ones(20,1),x1(j,:)',x2(j,:)',x3(j,:)'];
Y = y(j,:)';
b(:,j) = X\Y;
end
>> b(:,1:5)
ans =
1.0319 1.8816 3.0233 4.0347 5.0205
0.9892 1.0496 1.0443 1.0919 0.9576
2.0266 2.0864 1.9609 1.9148 1.8656
2.9049 3.1052 2.9136 2.9238 3.1082
You could embellish this to add more outcome variables (more columns of the Y matrix) and to subtract means at any point.
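A minimal sketch of that embellishment, assuming two outcome variables and mean subtraction within each frame. The names ya, yb, P, Q and B are hypothetical and the data are fake; this is one way it could look, not a definitive implementation.

```matlab
% Sketch only: fake data, hypothetical outcome names ya and yb.
nFrames = 101;
nTrials = 20;
x1 = rand(nFrames,nTrials);
x2 = rand(nFrames,nTrials);
x3 = rand(nFrames,nTrials);
ya = x1 + 2*x2 + 3*x3 + randn(nFrames,nTrials)/10;
yb = 3*x1 - x2 + randn(nFrames,nTrials)/10;

% One 4-by-2 coefficient matrix per frame: rows are [constant; x1; x2; x3],
% columns are the two outcome variables.
B = zeros(4,2,nFrames);
for j = 1:nFrames
    P = [x1(j,:)', x2(j,:)', x3(j,:)'];       % 20-by-3 predictors for frame j
    Q = [ya(j,:)', yb(j,:)'];                 % 20-by-2 outcomes for frame j
    P = P - repmat(mean(P,1), nTrials, 1);    % deviations from the frame mean
    Q = Q - repmat(mean(Q,1), nTrials, 1);
    B(:,:,j) = [ones(nTrials,1), P] \ Q;      % least-squares fit, 4-by-2
end
size(B)
```

Because both sides are mean-centred, the constant row of each B(:,:,j) comes out at essentially zero, so B(2:4,:,j) carries the slopes for frame j.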

7 Comments

Thanks for your reply Tom.
This is helpful. However, I'm looking for a slightly different output with regard to the coefficients.
With three predictor variables and two output variables (a [1 x 20] array for each variable) I obtained a [4 x 2] array of coefficients (i.e. one column per outcome variable, and one row per predictor variable plus a row for the constant), which I then used in a Jacobian matrix for further analysis.
Therefore, I'm looking to obtain the same [4 x 2] array from a data set where each variable is a [100 x 20] array. Expanding on this process, I would be looking to obtain an [8 x 3] array of coefficients if I used seven predictor variables and three outcome variables, again using the larger data set where each variable is a [100 x 20] array. Is this coefficient output possible?
Is the idea that you would treat the entire [100x20] array as a single variable, as if it were a vector of length 2000, rather than 100 different length-20 variables to be analyzed separately?
The [100 x 20] array is basically 100 frames of a time-normalised movement path over 20 trials for one predictor variable (e.g. knee angle). Based on the research paper I'm developing the code around, I could be using either a single set of coefficients, where each [100 x 20] array is one of seven predictor variables for three output variables (x, y, and z coordinates of the foot, each a [100 x 20] array), or coefficients for each percentage of the time-normalised movement path, i.e. coefficients for each [1 x 20] array with respect to the equivalent outcome variable at the same percentage. Sorry I can't be more specific. An example of both approaches would be very much appreciated, even if it just uses fewer predictor and outcome variables that I could expand upon later.
If you want to analyze all 100 frames and 20 trials together in a single fit (and I am not sure that is what you want to do), perhaps you just need to convert the arrays into vectors. For example, try this after the example I gave:
YY = bsxfun(@minus,y,trial); % take away "trial" effect
YY = YY(:);
XX = [ones(size(YY)),x1(:),x2(:),x3(:)] \ YY
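If the pooled fit is what you are after, the same idea extends to several outcome variables by giving the right-hand side one column per outcome. A minimal sketch with fake data, assuming two outcomes with the hypothetical names ya and yb (Xpool, Ypool and Bpool are also made-up names):

```matlab
% Sketch only: fake data, hypothetical names ya, yb, Xpool, Ypool, Bpool.
x1 = rand(101,20);
x2 = rand(101,20);
x3 = rand(101,20);
ya = x1 + 2*x2 + 3*x3 + randn(101,20)/10;
yb = 3*x1 - x2 + randn(101,20)/10;

% Stack every frame and trial into one long column per variable.
Xpool = [ones(numel(x1),1), x1(:), x2(:), x3(:)];   % 2020-by-4
Ypool = [ya(:), yb(:)];                             % 2020-by-2

% One fit over all frames and trials: rows are [constant; x1; x2; x3],
% columns are the two outcome variables.
Bpool = Xpool \ Ypool;                              % 4-by-2
size(Bpool)
```

This gives a single [4 x 2] coefficient array for the whole data set, which is only appropriate if the relationship is assumed to be the same at every frame.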
Thanks for your help.
I've managed to calculate the coefficients for all 100 frames over 20 trials. This works using 3 predictor variables (each a [100 x 20] array) and 2 outcome variables (again both a [100 x 20] array), giving a [3 x 2] array of coefficients for each frame. The coefficients are then transposed into a Jacobian (J) and the null space of J (Z = null(J)) is used for further analysis.
However, when I add an extra predictor variable (x4 = [100 x 20] array) I get the following response:
??? Error using ==> mtimes
Inner matrix dimensions must agree.
Error in ==> regsoccer1 at 50
UCM(:,i) = (Z'*dev(:,i))*Z;
Any help would be appreciated.
Just adding to my last comment, I used the following function:
function [VUCM,VUCMp,J]=regsoccer1(data,Y,d)
% Predictor variables
[M,N] = size(data);
mn = mean(data,2);
dev = data-repmat(mn,1,N);
x1 = dev (1,:);
x2 = dev (2,:);
x3 = dev (3,:);
% Y output variables
mnY = mean(Y,2);
devY = Y-repmat(mnY,1,N);
X = [ones(length(x1),1) x1' x2' x3'];
B = X\devY';
J = B';
Z = null(J);
and the following umbrella code:
for i=1:101;
data = [x1(i,:);x2(i,:);x3(i,:)];
Y = [x(i,:);y(i,:)];
[perp,para,J]=regsoccer1(data,Y,d);
VUCM(i)=[perp];
VUCMp(i)=[para];
end
Hi Tom,
If you're not already tired of my questions, I have a bit more information that will hopefully make things clearer. Any help would be appreciated.
I'm trying to see how joint angles of the right leg (predictor variables, X = x1, x2, x3 for the hip, knee, and ankle angles respectively, each a [1 x 20] array) could potentially stabilise the position of the right foot (outcome variables Y = x, y coordinate positions, both a [1 x 20] array) using linear regression.
"dev" (a [1 x 20] array for each predictor variable) is the deviations of joint angles from the mean joint angle configuration at each trial and projected onto the null-space or null(J)(Z = null(J)) using the following code:
for i = 1:N
UCM(:,i) = (Z'*dev(:,i))*Z;
end
The UCM is used to look at the control of a movement and is approximated linearly using the null space (Z) of the J matrix.
This code works for one frame (i.e. a [1 x 20] array) and for a whole normalised movement cycle (i.e. a [101 x 20] array) with 3 predictor variables (PVs) and 2 output variables (OVs), which give a [3 x 1] Z array.
However, when I increase the number of PVs to 4 (resulting in a [4 x 2] Z array) or 5 (resulting in a [5 x 2] Z array) with 2 OVs, I get the following error:
???Error using mm>mtimes
Inner matrix dimensions must agree
Therefore, I'm unable to analyse any data with more than three PVs.
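One possible reading of the error, going only by the sizes quoted above: with a [3 x 1] Z, Z'*dev(:,i) is a scalar, so (Z'*dev(:,i))*Z is a scalar times a vector and works. With a [4 x 2] Z, Z'*dev(:,i) is [2 x 1], and multiplying that by the [4 x 2] Z fails in mtimes. Putting Z on the left gives the usual projection onto the null space for any number of columns in Z. A minimal sketch with stand-in random data, assuming dev is [4 x 20] and J has no constant column (J, dev and ORT here are made up for illustration):

```matlab
% Sketch only: stand-in data with the sizes quoted in the thread
% (4 predictor variables, 2 outcome variables, 20 trials).
J = randn(2,4);        % stand-in Jacobian, outcomes by predictors
Z = null(J);           % 4-by-2 orthonormal basis of the null space of J
dev = randn(4,20);     % stand-in deviations, predictors by trials
N = size(dev,2);

UCM = zeros(size(dev));
for i = 1:N
    % (4x2)*((2x4)*(4x1)) -> 4x1, valid for any number of columns in Z
    UCM(:,i) = Z*(Z'*dev(:,i));
end

% Component of the deviations orthogonal to the null space.
ORT = dev - UCM;
```

The loop is equivalent to UCM = Z*(Z'*dev). When Z has a single column, Z*(Z'*dev(:,i)) gives the same result as the original (Z'*dev(:,i))*Z, so the 3 PV case should be unaffected.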
