
I have a set of data that needs to be grouped by date. Further calculations, such as RV and BPV, then need to be done on a daily basis.

I tried using the code newStr = extractBefore(str,11) to extract only the dates and then try the grouping. However, since ' is present in the string, the code is not able to extract the date ('08-Jan-2015 01:33:28').

Can someone help me group this data by date and perform further operations on each group?

In reference to the final_use variable in the workspace:

The first column in the attached .mat file holds the dates and times (in Unix format) and the second column holds the prices. I have calculated the returns, which are in the third column.

The code below converts the Unix time into a character array.

% Function to convert Unix time to a MATLAB serial date number

function dn = unixtime_to_datenum( unix_time )

dn = unix_time/86400 + 719529; %# == datenum(1970,1,1)

end

% Convert the serial date numbers to date strings

time = final_use(:,1)

str = datestr( unixtime_to_datenum( time ) )

Cris LaPierre
on 30 Dec 2019

I'm not sure what version of MATLAB you are using, but why not use the datetime data type? This line of code will convert the first column of data to datetimes, removing the need to create a separate function for this.

time = datetime(final_use(:,1),'ConvertFrom','epochtime',"Epoch",'1970-01-01')

Once you have the data as datetimes, you can compute summary statistics by grouping the data into specific time intervals (minutes, hours, days, weeks, months, etc.). Use the groupsummary function and specify the desired groupbin. For example, the mean price for each day can be computed this way (I convert the matrix to a table first):

load final_data_work.mat

time = datetime(final_use(:,1),'ConvertFrom','epochtime',"Epoch",'1970-01-01')

price = final_use(:,2);

returns = final_use(:,3);

dataTbl = table(time,price,returns);

summaryTbl = groupsummary(dataTbl,"time","day",'mean',"price")
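If groupsummary is not available in your release (I believe it was introduced in R2018a), the same daily grouping can be sketched with dateshift, findgroups and splitapply. This is only a minimal sketch on synthetic data: the timestamps and prices below are made up for illustration, and the names unixTime, dayOnly, dayList and dailyTbl are my own, not from the attached file.

```matlab
% Synthetic POSIX timestamps (seconds since 1970-01-01 UTC); illustrative values only
t0 = posixtime(datetime(2015,1,8,1,33,28));
unixTime = t0 + [0; 3600; 7200; 86400; 90000];   % three samples on 8 Jan, two on 9 Jan
price = [10; 12; 14; 20; 22];

% Convert to datetime, then truncate each timestamp to midnight of its day
time = datetime(unixTime,'ConvertFrom','posixtime');
dayOnly = dateshift(time,'start','day');

% One group index per calendar day, then the mean price within each group
[G,dayList] = findgroups(dayOnly);
meanPrice = splitapply(@mean,price,G);
dailyTbl = table(dayList,meanPrice);
```

Truncating with dateshift also avoids any string handling, so there is no need to cut the date out of the datestr output.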

Cris LaPierre
on 2 Jan 2020

It is still possible to use your own equations. However, when performing calculations on groups, each function must return a single output variable containing a single value per group, so you need to create a separate function for each value you want to compute. You can then use function handles to have those values computed for each group.

Some of your functions use the results of previous calculations, so they can only be computed once the values they depend on have been computed. I might set it up like this.

% Set up the data

load final_data_work.mat

time = datetime(final_use(:,1),'ConvertFrom','posixtime')

price = final_use(:,2);

returns = final_use(:,3);

dataTbl = table(time,price,returns);

% Define constants

alpha =0.999;

k=4/3;

mu=(2.^(k./2)).*gamma((k+1)./2)./gamma(0.5);

% Define functions

RV = @(r) sum(r.^2);

BV = @(r) (pi./2).*(length(r)./(length(r)-1)).*sum(abs(r(2:end)).*abs(r(1:end-1)));

TP = @(r) length(r).*mu.^(-3).*(length(r)./(length(r)-2)).*sum((abs(r(1:end-2)).^(k)).*(abs(r(2:end-1)).^(k)).*(abs(r(3:end)).^(k)));

% Compute dependent variables

summaryTbl = groupsummary(dataTbl,"time","day",{RV,BV,TP,'sum'},"returns");

summaryTbl.Properties.VariableNames(3:end) = ["RV" "BV" "TP" "RT"];

% Compute remaining values

summaryTbl.RJ = (summaryTbl.RV-summaryTbl.BV)./summaryTbl.RV;

summaryTbl.ZJ = summaryTbl.RJ./(sqrt((((pi./2).^2)+pi-5).*(1./summaryTbl.GroupCount).*max(1,summaryTbl.TP./(summaryTbl.BV.^2))));

summaryTbl.JT = sign(summaryTbl.RT).*sqrt((summaryTbl.RV-summaryTbl.BV).*(summaryTbl.ZJ>=norminv(alpha)))

Walter Roberson
on 2 Jan 2020

There is a trick: you can return a cell array. That qualifies as "a single output variable containing a single value."

If you return a row vector then afterwards you can cell2mat() and then array2table() if you want a table() of results.
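A minimal sketch of the cell-array trick, assuming synthetic returns (the numbers below are made up). I use findgroups and splitapply here because that is where I am sure a cell output is accepted; I have not checked whether groupsummary handles a cell-returning function the same way. The names allStats, statsCell, statsTbl and dayStart are hypothetical.

```matlab
% Synthetic returns spread over two days; illustrative values only
time = datetime(2015,1,8) + hours([1; 2; 3; 25; 26; 27]);
returns = [0.01; -0.02; 0.03; 0.02; -0.01; 0.01];

% Individual statistics, as in the answer above
RV = @(r) sum(r.^2);
BV = @(r) (pi/2)*(length(r)/(length(r)-1))*sum(abs(r(2:end)).*abs(r(1:end-1)));

% One function returning all statistics as a row vector wrapped in a cell
allStats = @(r) {[RV(r), BV(r), sum(r)]};

% Group by calendar day and apply the function once per group
[G,dayStart] = findgroups(dateshift(time,'start','day'));
statsCell = splitapply(allStats,returns,G);   % one 1x3 cell per day

% Unpack the cells into a numeric matrix, then into a table
statsTbl = array2table(cell2mat(statsCell),'VariableNames',{'RV','BV','RT'});
statsTbl.day = dayStart;
```

Each group is visited only once this way, at the cost of having to unpack the cells afterwards.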
