Fitting the data to power law using least square method
20 views (last 30 days)
Show older comments
Mammadbaghir Baghirzade
on 11 Feb 2021
Edited: Mammadbaghir Baghirzade
on 12 Feb 2021
Hi all,
I try to fit the attached data in the Excel spreadsheet to the following power law expression using the least square method.
I aim to obtain a, m and n.
In the attached Excel datasheet I have y, T and P.
y = a * ( (T/298) ^m ) * ( (P/1.2) ^n )
I have double checked the forum to see any similar questions, but the one I found was for different power law expression, in which I couldn't implement my own equation into it.
I wondered if you could suggest me any approach in MATLAB.
Thank you for your time.
Regards
0 Comments
Accepted Answer
the cyclist
on 12 Feb 2021
Here is code that will do the fit in the original space, without applying the log transform first:
% Load the data
data = readtable("power_law_using_least_square.xlsx");
% Strip off problematic first data point
data(1,:) = [];
% Define convenient variables
Y = data.Y;
% Put explanatory variables into an array
X = [data.T/298 data.P/1.2];
% Define function that will be used to fit data
% (F is a vector of fitting parameters)
f = @(F,x) F(1) .* x(:,1).^F(2) .* x(:,2).^F(3);
beta0 = [1 1 1];
mdl_original = fitnlm(X,Y,f,beta0)
% Plot actual versus predicted Y in original space
figure
plot(Y,predict(mdl_original,X),'o')
line([0 90],[0 90],'LineStyle','--','LineWidth',1,'Color','k')
axis equal
title('Fit in original space')
xlabel('Y from data')
ylabel('Y from fit')
More Answers (2)
the cyclist
on 11 Feb 2021
Edited: the cyclist
on 12 Feb 2021
I would solve this by first applying a natural log transform to both sides of the equation, which then gives you a linear equation.
% Original equation
% y = a * ( (T/298) ^m ) * ( (P/1.2) ^n )
% Transform by taking the natural log of both sides
% log(y) = log(a) + m * log(T/298) + n * log(P/1.2);
% Load the data
data = readtable("power_law_using_least_square.xlsx");
% Strip off problematic first data point
data(1,:) = [];
% Define convenient variables
logY = log(data.Y);
t = log(data.T/298);
p = log(data.P/1.2);
% Fit the transformed equation
mdl = fitlm([t p],logY)
% Plot actual versus predicted Y in log space
figure
plot(logY,predict(mdl,[t,p]),'o')
line([0 5],[0 5],'LineStyle','--','LineWidth',1,'Color','k')
axis equal
title('Fit in log space')
xlabel('log Y from data')
ylabel('log Y from fit')
% Plot actual versus predicted Y in linear space
figure
plot(exp(logY),exp(predict(mdl,[t,p])),'o')
line([0 90],[0 90],'LineStyle','--','LineWidth',1,'Color','k')
axis equal
title('Fit in original space')
xlabel('Y from data')
ylabel('Y from fit')
If you examine the resulting plots, you will see that the fit is not very good at all. I have not investigated whether this is because I made some mistake in the code, or if your model is a poor one for your data. But I figured I would post this regardless. Maybe it will help you.
Oh, also. I removed the data point for which Y == 0. Your model definitely can't fit that. I have no advice there.
9 Comments
John D'Errico
on 12 Feb 2021
Just looking at the plots shown by @the cyclist, I would come to the immediate conclusion this model will not fit that data. At least, not fit well. As such, my questions would be along the lines of why do you think that is the correct model?
John D'Errico
on 12 Feb 2021
Edited: John D'Errico
on 12 Feb 2021
Let me look more carefully at your data, as there are some thigns that REALLY bother me about the approach you wish to use, as well as your data.
TYP = [298 0 0.986923009833903
300.101235002186 43.1990254126569 1.01209349427671
301.908520257823 65.0218477665674 1.03411766816416
305.935528457621 50.083956213335 1.08445863704977
310.780047854022 48.5052713557729 1.14738484815678
317.018449674478 47.315470757813 1.23233523315124
324.648707751109 47.4190220529795 1.34245610258851
333.805405104856 49.5689966428311 1.48404007757928
344.874810493167 46.6742563932582 1.66967240034496
356.711851888339 49.793300320307 1.88676782866414
370.55927175896 50.0405809408228 2.16678946809033
385.660757806138 50.7992995854221 2.50659100806819
402.028726805153 51.0012599219203 2.9187576908191
419.372189798347 53.6314606614739 3.40958213745378
438.366194499516 53.6258762011997 4.01682007463642
458.068137769226 49.7585908621274 4.73103257070098
476.97709145931 36.2615978957431 5.5050249673172
491.282838125128 29.0057706899755 6.1531649417194
502.975369275774 17.5628384000534 6.72579346279318
510.206038314783 13.8072020943403 7.10020441887989
515.939729045497 9.79482823125674 7.40854285330424
520.03408770692 7.04103773327845 7.63507721328947
522.987452750413 4.1988379264908 7.80183167272305
524.746678534683 0.199850325276192 7.90251361049426
524.801368151412 0.199744549683599 7.90565992104962];
T = TYP(:,1);
Y = TYP(:,2);
P = TYP(:,3);
plot3(T,P,Y,'o')
view(-6,21)
box on
grid on
xlabel T
ylabel P
zlabel Y
UGH. This is not good data. You need to recognize that there appears to be a t least two points that are wholly inconsistent with the rest. At the very least, that will blow any such regression modeling out of the water.
Look at the plot! If I have said anything most frequently on this forum, it is to plot EVERYTHING! Then if you get bored, plot something else. Keep on plotting until you find somethign valuable. And think about EVERY plot. What does it tell you?
When I look at that figure, I see two points that apprear to be entirely inconsistent. One of them is the first data point, with Y==0. Then there is the third data point in your list, where y is 65. Again, this is entirely inconsistent with the others wher P and T are nearly the same sa at some other data points, yet Y is something completely different from the rest. Do you understand that this will blow ANY regression model out of the water?
So I would just exclude points 1 and 3. Even the second data point seems a bit strange, but leave it in for now.
Keeplist = setdiff(1:25,[1 3]);
plot(T(Keeplist),Y(Keeplist),'o')
plot(P(Keeplist),Y(Keeplist),'o')
Sadly, plotting the marginals, neither plot looks like any sort of power law behavior. And worse, T and P are highly correlated themselves. So my question becomes, why in the nae of god and little green apples do you think this data supports that model? My guess is you have no clue as to what model may be appropriate, so you decided to try a powerlaw fit.
Lets think about the model you chose. It was:
y = a * ( (T/298) ^m ) * ( (P/1.2) ^n )
What happens when T is zero? When P is zero? In both cases, Y must be exactly zero. So my first guess is that you created the first data point arbitrarily, hoping to force the curve through zero. That first point suspiciously has y EXACTLY at zero. But your data does NOT support anything of the sort!!!!!!!!!!
In fact, it looks like your data seems to indicate that as P and T are both small, the model should predict Y is roughly 50, maybe just a little less.
And that suggests your model might have a CHANCE in hell of being something like:
Y = c + a * ( (T/298) ^m ) * ( (P/1.2) ^n )
Here c will be roughly 50 when we estimate it. If we do that, when either P or T approaches zero, the model will have some chance of predicting the data.
So next, what might be reasonable values for n and m? This is a serious problem, because P and T are HIGHLY correlated. And the fact is, this latter model will still not fit well at all.
Again, the marginal plots I showed are simply not anything that looks remotely like any power law. You have a relationship that is virtually CONSTANT for a long way but at a specific point, it seems to drop off almost linearly. And this is completely inconsistent with any powerlaw behavior!
It is my STRONG recommendation that you reconsider what model you want to use here. The model should be based on reality, on physical principles if possible. Lacking that, you need to consider what you want to do with this model. Why are you trying to fit it? What will you do with that model?
2 Comments
the cyclist
on 12 Feb 2021
@John D'Errico, thanks so much for this deep dive. Saved me from doing all of this, which I was going to do over the weekend. :-)
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!