Multivariate Linear Regression with Weibull errors

6 views (last 30 days)
Hi,
I am trying to run a linear regression where the dependent variable lies between 0 and 1. The errors are distributed by 2 parameter Weibull distribution. So essentially I will be estimating 4 parameters: a constant, a parameter for an independent variable and 2 parameters of the weibull distribution. Can somebody guide me as to how I proceed to do this? I am mostly getting search results for survival data but my data is non-survival data. Thanks!

Accepted Answer

John D'Errico
John D'Errico on 8 May 2016
Do some reading about maximum likelihood estimation . This is the general technique one uses for an estimation problem where the errors in the process are assumed to follow some known distribution that is not Gaussian.
Sorry, but I won't write the code for you. One hint I will offer is that it is almost always better to maximize the log (i.e., minimize the negative of the log) of the likelihood function, as otherwise the problem quickly becomes intractable in double precision arithmetic.
  4 Comments
John D'Errico
John D'Errico on 8 May 2016
Edited: John D'Errico on 8 May 2016
Let me explain in a little more detail what this will involve.
You need to have an objective function that for ANY set of 4 parameters, thus two line parameters, and two Weibull parameters...
1. Compute the residual. That MUST be a positive number for all data points, so you need to use a tool than can handle constraints.
2. Given the set of PRESUMED Weibull parameters, compute the Weibull PDF for each of those residuals, for those given parameters.
3. Return the product of those individual values from the PDF. This may be a very tiny number, in fact, you will almost always see underflows in this. So it is better if you compute the sum of logs of those values, and maximize that sum instead.
So this will be a 4 parameter optimization, subject to linear inequality constraints at each data point, as well of course, on the Weibull parameters since the shape and scale parameters for the Weibull must both be positive.
The best tool for the optimization would arguably be something like fmincon, although even there you need to be careful, since fmincon does allow for tiny constraint deviations. It's been a while since I looked, but one of the algorithms in fmincon may assure strict adherence of the constraints.
Prachi  Singh
Prachi Singh on 9 May 2016
Thank you so much for your guidance. I do accept that the method outlined by me is inferior to a full mle (using 4 parameters), but I don't know why it was suggested in a Biometrika paper.
I am relatively new to matlab, but I will try to write a code for full mle of the model with constraints. fmincon does seem to be the correct choice but I have as yet not been able to write the correct code.
Thank you again, for pointing out my mistake, I was trying to jump to an easy solution while frantically looking for a way to solve the problem.

Sign in to comment.

More Answers (1)

Prachi  Singh
Prachi Singh on 9 May 2016
Edited: Prachi Singh on 9 May 2016
Hi John,
I am trying to run the below code -
--------------------
%Log likelihood function for 4 parameter model(2 line and 2 Weibull parameters)
%p(1), p(2) are Weibull shape parameters & p(3) is line intercept parameter, p(4) is slope parameter
function val=tryone(p,xdata, ydata)
p = [p(1), p(2), p(3), p(4)];
val = -sum(log(p(1))) -sum(log(p(2))) -(p(2)-1)*sum(log(ydata - p(3) - p(4)*xdata))+ p(1)*sum((ydata - p(3) - p(4)*xdata).^p(2));
%Constraint for Weibull errors to be positive
function [c, ceq] = errors(p,xdata,ydata)
c = ydata - p(3) - p(4)*xdata;
ceq = [ ];
%Code:
p_initial = [0.1, 0.1, 0.1, 0.1];
options = optimoptions(@fmincon,'Display','iter','Algorithm','interior-point');
[x,fval] = fmincon(@tryone,p_initial,[],[],[],[],[],[],@errors,options)
---------------------------------------
I am encountering few problems:
1) I don't know how to incorporate constraint for Weibull parameters to be positive
2) I am getting an error "Not enough input arguments" for this command (loglikelihood function mentioned above) -
val = -sum(log(p(1))) -sum(log(p(2))) -(p(2)-1)*sum(log(ydata - p(3) - p(4)*xdata))+ p(1)*sum((ydata - p(3) - p(4)*xdata).^p(2));
I am really new to Matlab and trying my best to understand things, please let me know if I have committed an error while writing the code.
Thanks alot for your patience!

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!