Use splitapply with division

Luca on 27 Sep 2023
Edited: Luca on 28 Sep 2023
Hi,
I have data on 76 stocks over one year. I would like to normalize each stock's data by dividing its whole time series by its first entry.
For a single stock, it works like this:
D1990 = D(D.year==1990 & D.gvkey==15497,:);
D1990.pricenorm = D1990{:,"priceadj"}./D1990{1,"priceadj"};
In the data, gvkey is the unique stock ID and priceadj is the price of the stock on each day. The other variables are just date variables.
So my idea was to do it with splitapply, but unfortunately I can't get it to work.
[group1, ID] = findgroups(D1990.gvkey);
x = splitapply(@(x,y) x./y, D1990{:,"priceadj"}, D1990{1,"priceadj"} group1);
I think using the ID as the group doesn't work, and I'm also not sure whether I'm using the function in splitapply correctly.
I also attached the actual file.
Does anyone know how to fix it?
Thank you in advance.

Accepted Answer

Mario Malic on 27 Sep 2023
Hey, is this what you are looking for?
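load D1990.mat   % provides the table D1990new
[group1, ID] = findgroups(D1990new.gvkey);   % group index for every row, one group per stock
% Divide each stock's prices by that stock's first price; the cell wrapper is needed because groups differ in length
y = splitapply(@(x) {x./x(1)}, D1990new.priceadj, group1)
% Stack the groups back together (this matches the table's row order when the rows are sorted by gvkey)
D1990new.priceadjNorm = cell2mat(y)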
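One thing to keep in mind: the cell2mat step stacks the results in group order, so it only lines up with the table if the rows are already sorted by gvkey. If that is not guaranteed, a possible variant is to compute one first price per group and expand it back through the group index. This is only a sketch on a made-up toy table (the values and the names T, firstPrice and pricenorm are assumptions for illustration, not the attached file):

```matlab
% Toy table with rows deliberately NOT sorted by gvkey (hypothetical values)
gvkey    = [2; 1; 2; 1; 2];
priceadj = [10; 4; 20; 8; 5];
T = table(gvkey, priceadj);

% Group index per row; groups are numbered by sorted unique gvkey
g = findgroups(T.gvkey);

% One scalar per group: the first price that appears for that stock
firstPrice = splitapply(@(x) x(1), T.priceadj, g);

% Index with g to expand the per-group value back to one value per row
T.pricenorm = T.priceadj ./ firstPrice(g);
```

Because the result is built row by row from the group index, no cell array or cell2mat is needed and the original row order is preserved.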

More Answers (1)

dpb on 27 Sep 2023
Edited: dpb on 28 Sep 2023
@Mario Malic fixed the problem w/ splitapply; you only wanted to divide by the first element of the group. That element is a scalar, so the "dot" divide operator isn't required here; it doesn't hurt anything to use it and is probably best practice to do so.
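A quick way to convince yourself of the scalar-divisor point, using a made-up price vector (the values are hypothetical, not from the attached file):

```matlab
% Toy price series (hypothetical values)
p = [10; 20; 5];

a = p ./ p(1);   % element-wise right division
b = p /  p(1);   % matrix right division by a scalar

% With a scalar divisor both operators return the same vector
same = isequal(a, b);
```

The two only differ when the divisor is not a scalar, which is why sticking with ./ is the safer habit.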
An alternative to illustrate some other newer features of tables...
load D1990
tD=D1990new; % get a short name for convenience
clear D1990new
tD=addvars(tD,cell2mat(rowfun(@(p)p/p(1),tD,'GroupingVariables',{'gvkey'},'InputVariables',{'priceadj'}, ...
'OutputVariableName',{'pricenorm'},'OutputFormat','cell')), ...
'After','priceadj','NewVariableNames',{'pricenorm'});
format bank
head(tD)
     gvkey         date         month     year       monthyear    monthyear_1    priceadj    pricenorm
    ________    ___________    _____    _______    _________    ___________    ________    _________
    15497.00    30-Jan-1990     1.00    1990.00    Jan-1990      Jan-1990      1908.18       1.00
    15497.00    13-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1908.18       1.00
    15497.00    23-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1799.55       0.94
    15497.00    26-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1804.27       0.95
    15497.00    28-Feb-1990     2.00    1990.00    Feb-1990      Feb-1990      1799.55       0.94
    15497.00    01-Mar-1990     3.00    1990.00    Mar-1990      Mar-1990      1794.82       0.94
    15497.00    06-Mar-1990     3.00    1990.00    Mar-1990      Mar-1990      1790.10       0.94
    15497.00    07-Mar-1990     3.00    1990.00    Mar-1990      Mar-1990      1794.82       0.94
  1 Comment
Luca on 28 Sep 2023
Edited: Luca on 28 Sep 2023
Thank you very much, this works too. I wasn't aware of the function addvars; it's cool to learn something new.

Release

R2023a