# crosscorr

Sample cross-correlation

## Description

[xcf,lags] = crosscorr(y1,y2) returns the sample cross-correlation function (XCF) xcf and associated lags lags between the univariate time series y1 and y2.

XCFTbl = crosscorr(Tbl) returns the table XCFTbl containing variables for the sample XCF and associated lags of the last two variables in the input table or timetable Tbl. To compute the XCF for a different pair of variables in Tbl, use the DataVariables name-value argument.

[___,bounds] = crosscorr(___) uses any input-argument combination in the previous syntaxes, and returns the output-argument combination for the corresponding input arguments and the approximate upper and lower confidence bounds bounds on the XCF.

[___] = crosscorr(___,Name=Value) uses additional options specified by one or more name-value arguments. For example, crosscorr(Tbl,DataVariables=["RGDP" "CPI"],NumLags=10,NumSTD=1.96) returns the sample XCF for lags -10 through 10 of the table variables "RGDP" and "CPI" in Tbl and 95% confidence bounds.

crosscorr(___) plots the sample XCF between the input series with confidence bounds.

crosscorr(ax,___) plots on the axes specified by ax instead of the current axes (gca). ax can precede any of the input argument combinations in the previous syntaxes.

[___,h] = crosscorr(___) plots the sample XCF between the input series and additionally returns handles to plotted graphics objects. Use elements of h to modify properties of the plot after you create it.
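
The plotting syntaxes can be combined. The following is a minimal sketch, assuming synthetic data (x, y, and the tiled layout are illustrative and not part of any shipped data set), that draws two XCF plots on explicitly chosen axes and keeps the graphics handles from the second call.

```matlab
rng(0);                                          % for reproducibility
x = randn(200,1);
y = [zeros(2,1); x(1:end-2)] + 0.5*randn(200,1); % y follows x by 2 periods (synthetic)

fig = figure(Visible="off");                     % hidden figure; remove Visible to display it
tl = tiledlayout(fig,1,2);

ax1 = nexttile(tl);
crosscorr(ax1,x,y)                               % default settings on the first axes

ax2 = nexttile(tl);
[~,~,~,h] = crosscorr(ax2,x,y,NumLags=10);       % fourth output: handles to plotted objects
title(ax2,"XCF, NumLags=10")
```

Passing ax explicitly avoids depending on whichever axes happen to be current, which is useful when several plots share one figure.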

## Examples

Compute the XCF between two univariate time series. Input the time series data as numeric vectors.

Load the equity index data Data_EquityIdx.mat. The variable Data is a 3028-by-2 matrix of daily closing prices from the NASDAQ and NYSE composite indices. Plot the two series.

load Data_EquityIdx
yyaxis left
dt = datetime(dates,ConvertFrom="datenum");
plot(dt,Data(:,1))
ylabel("NASDAQ")
yyaxis right
plot(dt,Data(:,2))
ylabel("NYSE")
title("Daily Closing Prices, 1990-2001")

The series exhibit exponential growth.

Compute the returns of each series.

Ret = price2ret(Data);

Ret is a 3027-by-2 series of returns; it has one fewer observation than Data.

Compute the XCF between the NASDAQ and NYSE returns, and return the associated lags.

rnasdaq = Ret(:,1);
rnyse = Ret(:,2);
[xcf,lags] = crosscorr(rnasdaq,rnyse);

xcf and lags are 41-by-1 vectors that describe the XCF.

Display several values of the XCF.

XCF = [xcf lags];
XCF([1:3 20:22 end-2:end],:)
ans = 9×2

-0.0108  -20.0000
0.0186  -19.0000
-0.0002  -18.0000
0.0345   -1.0000
0.7080         0
0.0651    1.0000
-0.0461   18.0000
0.0010   19.0000
0.0015   20.0000

The correlation between the current NASDAQ return and the NYSE return from 20 days before is xcf(1) = -0.0108. The contemporaneous (lag 0) correlation between the NASDAQ and NYSE returns is xcf(21) = 0.7080. The correlation between the NASDAQ return from 20 days ago and the current NYSE return is xcf(41) = 0.0015.
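
Rather than hard-coding positions such as 1, 21, and 41, you can look up XCF values by lag. The following is a minimal sketch, assuming two synthetic return series (r1 and r2 are illustrative stand-ins, not the equity index data), that uses the lags output as an index.

```matlab
rng(0);                           % for reproducibility
T = 500;
common = randn(T,1);              % shared component that induces lag 0 correlation
r1 = common + 0.5*randn(T,1);     % synthetic series 1
r2 = common + 0.5*randn(T,1);     % synthetic series 2

[xcf,lags] = crosscorr(r1,r2);    % default NumLags is 20 here, so 41 values

xcfLag0  = xcf(lags == 0);        % contemporaneous correlation (center element)
xcfNeg20 = xcf(lags == -20);      % first element
xcfPos20 = xcf(lags == 20);       % last element
```

Indexing by lags == k keeps the code correct if you later change NumLags, because the position of lag 0 moves with the number of lags.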

Compute the XCF between two univariate time series, which are two variables in a table.

Load the equity index data Data_EquityIdx.mat. The variable DataTable is a 3028-by-2 table of daily closing prices from the NYSE and NASDAQ composite indices, which are stored in the variables NYSE and NASDAQ.

load Data_EquityIdx
DataTable.Properties.VariableNames
ans = 1x2 cell
{'NYSE'}    {'NASDAQ'}

Compute the returns of the series. Store the results in a new table.

RetTbl = price2ret(DataTable);
head(RetTbl)
Tick    Interval       NYSE         NASDAQ
____    ________    __________    __________

2         1        -0.0010106     0.0034122
3         1        -0.0076633    -0.0032816
4         1        -0.0084415    -0.0025501
5         1         0.0035387     0.0010688
6         1         -0.010188    -0.0042382
7         1        -0.0063818     -0.013378
8         1         0.0034295    -0.0040909
9         1         -0.023407     -0.020573

RetTbl is a 3027-by-4 table containing the returns of the indices, ticks (days by default), and time intervals between successive prices.

Compute the XCF between the NASDAQ and NYSE return series.

XCFTbl = crosscorr(RetTbl)
XCFTbl=41×2 table
Lags        XCF
____    ___________

-20       -0.010809
-19        0.018571
-18     -0.00016185
-17       -0.020271
-16       -0.029353
-15      0.00023188
-14      -0.0080616
-13        0.041498
-12        0.078821
-11       -0.013793
-10       0.0076655
-9         0.01763
-8      -0.0011033
-7       -0.011457
-6       -0.016523
-5       -0.046749
⋮

crosscorr returns the results in the table XCFTbl, where variables correspond to the XCF (XCF) and associated lags (Lags).

By default, crosscorr computes the XCF of the last two variables in the table. To select other variables from an input table, set the DataVariables option.

Consider the equity index series in Compute XCF of Table Variable.

Load the NYSE and NASDAQ closing price series in Data_EquityIdx.mat and preprocess the series. Compute the XCF and return the XCF confidence bounds.

load Data_EquityIdx
RetTbl = price2ret(DataTable);
[XCFTbl,bounds] = crosscorr(RetTbl)
XCFTbl=41×2 table
Lags        XCF
____    ___________

-20       -0.010809
-19        0.018571
-18     -0.00016185
-17       -0.020271
-16       -0.029353
-15      0.00023188
-14      -0.0080616
-13        0.041498
-12        0.078821
-11       -0.013793
-10       0.0076655
-9         0.01763
-8      -0.0011033
-7       -0.011457
-6       -0.016523
-5       -0.046749
⋮

bounds = 2×1

0.0364
-0.0364

Assuming the NYSE and NASDAQ return series are uncorrelated, an approximate 95.4% confidence interval on the XCF is (-0.0364, 0.0364).
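
The displayed bounds and the 95.4% figure can be checked by hand. The following sketch assumes that the standard error under the no-correlation hypothesis is approximately 1/sqrt(T); that approximation is an inference from the displayed numbers, not a formula stated on this page. The coverage is the standard normal probability of falling within NumSTD standard errors of 0.

```matlab
T = 3027;                                  % number of return observations in RetTbl
numSTD = 2;                                % number of standard errors used in this example

% Assumed large-sample standard error of the sample cross-correlation
approxBounds = numSTD/sqrt(T)*[1; -1];     % approximately [0.0364; -0.0364]

% Two-sided standard normal coverage for +/- numSTD standard errors
coverage = erf(numSTD/sqrt(2));            % approximately 0.954
```

Under the same assumption, NumSTD=1.96 gives a coverage of approximately 0.95.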

Generate 100 random variates from a Gaussian distribution with mean 0 and variance 1.

rng(3); % For reproducibility
x = randn(100,1);

Create a 4-period delayed version of x.

y = lagmatrix(x,4);

Plot the XCF between x and y. Because lagmatrix prepends lagged series with NaN values and crosscorr does not support NaN values, start the series at observation 5.

crosscorr(x(5:end),y(5:end))

The upper and lower confidence bounds are the horizontal lines in the XCF plot. By design, the XCF peaks at lag 4.
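
To read the peak off numerically instead of from the plot, request the xcf and lags outputs. The following is a short sketch of that approach for the same simulated series.

```matlab
rng(3);                                       % same seed as the example above
x = randn(100,1);
y = lagmatrix(x,4);                           % 4-period delayed copy of x

% Skip the leading NaN values that lagmatrix prepends
[xcf,lags] = crosscorr(x(5:end),y(5:end));

[peakXCF,idx] = max(xcf);                     % largest cross-correlation
peakLag = lags(idx);                          % lag at which it occurs
```

A positive peak lag indicates that the second series follows the first by that many periods.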

Load the currency exchange rates data set Data_FXRates.mat. The table DataTable contains daily exchange rates of several countries, relative to the US dollar from 1980 through 1998 (with omissions).

load Data_FXRates
dt = datetime(dates,ConvertFrom="datenum");

Plot the UK pound and French franc exchange rates.

yyaxis left
plot(dt,DataTable.GBP)
ylabel("UK Pound/$")
yyaxis right
plot(dt,DataTable.FRF)
ylabel("French Franc/$")

The series appear to be correlated.

Stabilize all series in the table by computing the first difference.

DiffDT = varfun(@diff,DataTable);
DiffDT.Properties.VariableNames = DataTable.Properties.VariableNames;

Determine whether lags of one series are associated with the other series by computing the XCF between the daily changes in the UK pound and French franc exchange rates.

figure
crosscorr(DiffDT,DataVariables=["GBP" "FRF"]);

The series have a high contemporaneous correlation, but all other cross-correlations are either insignificant or below 0.1.

Specify the AR(1) model for the first series:

${y}_{1t}=2+0.3{y}_{1t-1}+{\epsilon }_{t},$

where ${\epsilon }_{t}$ is Gaussian with mean 0 and variance 1.

MdlY1 = arima(AR=0.3,Constant=2,Variance=1);

MdlY1 is a fully specified arima object representing the AR(1) model.

Simulate data from the AR(1) model.

rng(3); % For reproducibility
T = 1000;
y1 = simulate(MdlY1,T);

Simulate standard Gaussian variates for the second series; induce correlation at lag 36.

y2 = [randn(36,1); y1(1:end-36) + randn(T-36,1)*0.1];

Plot the XCF by using the default settings.

crosscorr(y1,y2)

All correlations in the plot are within the 2-standard-error confidence bounds. Therefore, none are significant.

Plot the XCF for 60 lags on both sides of lag 0. Specify 3 standard errors for the confidence bounds.

crosscorr(y1,y2,NumLags=60,NumSTD=3)

The plot shows significant correlations at and around lag 36.
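
You can also identify the significant lags programmatically by comparing the XCF with the returned bounds. The following is a minimal sketch; it assumes an AR(1) series built with filter in place of arima and simulate, so the simulated values differ from the example above.

```matlab
rng(3);                                            % for reproducibility
T = 1000;

% AR(1) with constant 2 and coefficient 0.3, generated with filter (an assumption
% made to keep the sketch short; it is not the arima/simulate path used above)
y1 = filter(1,[1 -0.3],2 + randn(T,1));
y2 = [randn(36,1); y1(1:end-36) + 0.1*randn(T-36,1)];   % y2 follows y1 by 36 periods

[xcf,lags,bounds] = crosscorr(y1,y2,NumLags=60,NumSTD=3);

isSignificant = abs(xcf) > bounds(1);              % bounds(1) is the upper bound
sigLags = lags(isSignificant);                     % lags outside the 3-standard-error band
```

Because the default NumLags is at most 20, a dependence at lag 36 is invisible unless you widen the lag window.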

## Input Arguments

y1: Univariate time series data, specified as a numeric vector of length T1.

Data Types: double

y2: Univariate time series data, specified as a numeric vector of length T2.

Data Types: double

Tbl: Time series data, specified as a table or timetable with T rows. Each row of Tbl contains contemporaneous observations of all variables.

Specify the two input series (variables) by using the DataVariables argument. The selected variables must be numeric.

ax: Axes on which to plot, specified as an Axes object.

By default, crosscorr plots to the current axes (gca).

Note

Missing observations, specified by NaN entries in the input series, result in a NaN-valued XCF.

### Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: crosscorr(Tbl,DataVariables=["RGDP" "CPI"],NumLags=10,NumSTD=1.96) returns the sample XCF for lags -10 through 10 of the table variables "RGDP" and "CPI" in Tbl and 95% confidence bounds.

NumLags: Number of lags in the sample XCF, specified as a positive integer. crosscorr uses lags 0, ±1, ±2, …, ±NumLags to compute the sample XCF.

If you supply y1 and y2, the default is min(20, min(T1,T2) – 1). If you supply Tbl, the default is min(20, T – 1).

Example: crosscorr(y1,y2,NumLags=10) plots the sample XCF between y1 and y2 for lags –10 through 10.

Data Types: double

NumSTD: Number of standard errors in the confidence bounds, specified as a nonnegative scalar. The confidence bounds are 0 ± NumSTD*$\hat{\sigma}$, where $\hat{\sigma}$ is the estimated standard error of the sample cross-correlation between the input series, assuming the series are uncorrelated.

The default is 2, which yields approximate 95% confidence bounds.

Example: crosscorr(y1,y2,NumSTD=1.5) plots the XCF of y1 and y2 with confidence bounds 1.5 standard errors away from 0.

Data Types: double

DataVariables: Two variables in Tbl for which crosscorr computes the XCF, specified as a string vector or cell vector of character vectors containing two variable names in Tbl.Properties.VariableNames, or an integer or logical vector representing the indices of two names. The selected variables must be numeric.

Example: DataVariables=["GDP" "CPI"]

Example: DataVariables=[true true false false] or DataVariables=[1 2] selects the first and second table variables.

Data Types: double | logical | char | string
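
The three selection styles are interchangeable. The following is a minimal sketch, assuming a small synthetic table (Tbl and its variable names A, B, and C are hypothetical), that selects the same pair of variables by name, by index, and by logical mask, and then compares the results.

```matlab
rng(1);                                               % for reproducibility
Tbl = table(randn(50,1),randn(50,1),randn(50,1), ...
    VariableNames=["A" "B" "C"]);                     % hypothetical data

XByName  = crosscorr(Tbl,DataVariables=["A" "B"]);    % select by name
XByIndex = crosscorr(Tbl,DataVariables=[1 2]);        % select by position
XByMask  = crosscorr(Tbl,DataVariables=[true true false]);  % select by logical mask

sameResult = isequal(XByName.XCF,XByIndex.XCF,XByMask.XCF);

% Without DataVariables, crosscorr uses the last two variables (B and C here)
XDefault  = crosscorr(Tbl);
XLastPair = crosscorr(Tbl,DataVariables=["B" "C"]);
sameDefault = isequal(XDefault.XCF,XLastPair.XCF);
```

Selecting by name is usually the most robust choice, because it does not depend on the order of the variables in the table.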

## Output Arguments

xcf: Sample XCF between the input time series, returned as a numeric vector of length 2*NumLags + 1.

The elements of xcf correspond to the elements of lags. The center element is the lag 0 cross-correlation. crosscorr returns xcf only when you supply the inputs y1 and y2.

lags: XCF lags, returned as a numeric vector with elements (-NumLags):NumLags having the same orientation as y1. crosscorr returns lags only when you supply the inputs y1 and y2.

XCFTbl: Sample XCF, returned as a table with variables for the outputs xcf and lags. crosscorr returns XCFTbl only when you supply the input Tbl.

bounds: Approximate upper and lower XCF confidence bounds assuming the input series are uncorrelated, returned as a two-element numeric vector. The NumSTD option specifies the number of standard errors from 0 in the confidence bounds.

h: Handles to plotted graphics objects, returned as a graphics array. h contains unique plot identifiers, which you can use to query or modify properties of the plot.

### Cross-Correlation Function

The cross-correlation function (XCF) measures the similarity between a time series and lagged versions of another time series as a function of the lag.

Consider the time series $y_{1,t}$ and $y_{2,t}$ and lags k = 0, ±1, ±2, …. For data pairs $(y_{1,1},y_{2,1})$, $(y_{1,2},y_{2,2})$, …, $(y_{1,T},y_{2,T})$, an estimate of the lag k cross-covariance is

$c_{y_1 y_2}(k)=\begin{cases}\frac{1}{T}\sum_{t=1}^{T-k}\left(y_{1,t}-\overline{y}_1\right)\left(y_{2,t+k}-\overline{y}_2\right); & k=0,1,2,\dots \\ \frac{1}{T}\sum_{t=1}^{T+k}\left(y_{2,t}-\overline{y}_2\right)\left(y_{1,t-k}-\overline{y}_1\right); & k=0,-1,-2,\dots \end{cases},$

where ${\overline{y}}_{1}$ and ${\overline{y}}_{2}$ are the sample means of the series.

The sample standard deviations of the series are:

• ${s}_{{y}_{1}}=\sqrt{{c}_{{y}_{1}{y}_{1}}\left(0\right)},$ where ${c}_{{y}_{1}{y}_{1}}\left(0\right)=Var\left({y}_{1}\right).$

• ${s}_{{y}_{2}}=\sqrt{{c}_{{y}_{2}{y}_{2}}\left(0\right)},$ where ${c}_{{y}_{2}{y}_{2}}\left(0\right)=Var\left({y}_{2}\right).$

An estimate of the cross-correlation is

${r}_{{y}_{1}{y}_{2}}\left(k\right)=\frac{{c}_{{y}_{1}{y}_{2}}\left(k\right)}{{s}_{{y}_{1}}{s}_{{y}_{2}}};\quad k=0,\pm 1,\pm 2,\dots .$
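
The definitions above translate directly into a time-domain loop. The following is a minimal sketch, assuming synthetic data (y1 and y2 are illustrative), that evaluates the cross-covariance and cross-correlation formulas term by term and measures the difference from the crosscorr output. The two results are expected to agree up to floating-point error.

```matlab
rng(1);                                          % for reproducibility
T = 200;
y1 = randn(T,1);
y2 = [zeros(3,1); y1(1:end-3)] + 0.5*randn(T,1); % synthetic series related to y1
numLags = 10;

d1 = y1 - mean(y1);                              % deviations from the sample means
d2 = y2 - mean(y2);
s1 = sqrt(sum(d1.^2)/T);                         % s_{y1} = sqrt(c_{y1 y1}(0))
s2 = sqrt(sum(d2.^2)/T);                         % s_{y2} = sqrt(c_{y2 y2}(0))

lagsManual = (-numLags:numLags)';
xcfManual = zeros(size(lagsManual));
for i = 1:numel(lagsManual)
    k = lagsManual(i);
    if k >= 0
        c = sum(d1(1:T-k).*d2(1+k:T))/T;         % pairs y1(t) with y2(t+k)
    else
        c = sum(d2(1:T+k).*d1(1-k:T))/T;         % pairs y2(t) with y1(t-k)
    end
    xcfManual(i) = c/(s1*s2);                    % r_{y1 y2}(k)
end

[xcf,lags] = crosscorr(y1,y2,NumLags=numLags);
maxDiff = max(abs(xcf(:) - xcfManual));          % largest discrepancy across lags
```

The loop is written for clarity rather than speed; it costs on the order of T operations per lag.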

## Algorithms

• If y1 and y2 have different lengths, crosscorr appends enough zeros to the end of the shorter vector to make both vectors the same size.

• crosscorr uses a Fourier transform (fft) to compute the XCF in the frequency domain, and then crosscorr converts back to the time domain using an inverse Fourier transform (ifft).

• NaN values in the input series result in NaN values in the output XCF. Unlike autocorr and parcorr, crosscorr does not treat NaN values as missing completely at random. Whereas autocorr and parcorr compute coefficients in the time domain, crosscorr uses fft and ifft to compute coefficients in the frequency domain. Therefore, missing data treatments follow fft and ifft defaults.

• crosscorr plots the XCF when you do not request any output or when you request the fourth output.
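
The frequency-domain computation can be sketched with fft and ifft directly. The following is an illustration under stated assumptions, not the internal implementation: the padding length nFFT, the use of conj on the first transform, and the index rearrangement are choices made for this sketch. The last line shows how a single NaN propagates through fft.

```matlab
rng(2);                                      % for reproducibility
T = 128;
y1 = randn(T,1);
y2 = 0.6*y1 + randn(T,1);                    % synthetic series correlated with y1
numLags = 5;

d1 = y1 - mean(y1);                          % remove the sample means
d2 = y2 - mean(y2);

nFFT = 2^nextpow2(2*T);                      % zero-pad so circular shifts do not wrap around
F1 = fft(d1,nFFT);
F2 = fft(d2,nFFT);
r = real(ifft(conj(F1).*F2));                % r(m+1) = sum over t of d1(t)*d2(t+m), circularly

% Negative lags sit at the end of r, nonnegative lags at the start
raw = [r(nFFT-numLags+1:nFFT); r(1:numLags+1)];
xcfFFT = raw/sqrt(sum(d1.^2)*sum(d2.^2));    % normalize by the lag 0 autocovariances

xcf = crosscorr(y1,y2,NumLags=numLags);
maxDiff = max(abs(xcf(:) - xcfFFT));         % discrepancy from crosscorr

allNaN = all(isnan(fft([1; NaN; 3])));       % one NaN makes every fft coefficient NaN
```

Because every Fourier coefficient depends on every observation, a single missing value contaminates all lags, which is why missing observations must be handled before calling crosscorr.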

## References

[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

## Version History

Introduced before R2006a