Tutorial on the selection of the value for the integral scale

In this tutorial, we will illustrate how the choice of the value for the number of integral scales, which is a model parameter that is assumed to be provided by the user, can be guided by means of the maximum (log-)posterior value attained by the model.

The number of integral scales is defined as the data length divided by the decorrelation length.

Contents

Preliminaries and input data

The data under analysis is given by a realization of 15 multifractal time series (N=2^14) with controlled multifractal properties and integral scale. For the first 5 time series, (c1,c2)=(0.7,-0.02), for the next 5 time series, (c1,c2)=(0.7,-0.04), and for the last 5 time series, (c1,c2)=(0.7,-0.01). The number of integral scales is 2^7 for all series (i.e., the decorrelation length is N/2^7=2^7 samples).

% Add the path to toolbox
clear all; clc;
pth=which('est_bayes_mvmfa');if ispc; ii=strfind(pth,'\');else;ii=strfind(pth,'/');end;pth=pth(1:ii(end));addpath(genpath(pth));
% Load synthetic data and theoretical values for c2
load('mrw14_IS.mat');

% Plot the data
zdata=zscore(data')/4;
figure(10);clf;hold on;set(gcf,'position',[10 500 500 400]);
for k=1:size(data,1);LS{k}=num2str(k);plot(zdata(:,k)+k-zdata(1,k));end
grid on;set(gca,'xlim',[1 N]);set(gca,'ylim',[0 16.2]);title('The data set: ensemble of 15 time series');ylabel('time series');%legend(LS);
set(gca,'ytick',[1:15]);set(gca,'yticklabel',[1:15]);

% Multifractal analysis parameters
Nwt = 2;    % number of vanishing moments of wavelet (Daubechies)
gamint = 0; % fractional integration parameter for wavelet leaders
j1=2;       % scaling range lower cutoff
j2=6;       % scaling range upper cutoff

% Data handling parameters
sw=N;  % time window size (here, the entire time series)
dsw=0*sw;       % overlap of windows (here, none)

Scaling and correlation analysis

It can be useful to investigate the linear behavior with scales j of the first and second log-cumulants C1(j) and C2(j). For multifractal cascade processes with a decorrelation length shorter than the data length, C2(j) should saturate at 0 as soon as j>=J, where 2^J is the decorrelation length. As can be seen in the following plot, this is however far from obvious to decide on (remember that the decorrelation length of the data analyzed here is 2^7, yet it is hard to infer this value from the behavior of C2(j)).

% compute mean and variances of log-leaders for all scales (for scaling
% investigation only)
[lxft,pd0,data0,pdata] = preproc_data_mvmfa(data,sw,dsw,1,log2(N)-3,Nwt,gamint);

% plot C1(j) and C2(j) as a function of scale j (overlay of all data
% components)
figure(1);clf; set(gcf,'position',[10 500 1000 300]);
subplot(121);plot(pd0.mV','d-');grid on; xlabel('j = log2(scale)'); ylabel('C1(j)');
subplot(122);plot(pd0.V','d-');grid on; xlabel('j = log2(scale)'); ylabel('C2(j)');

For multifractal cascades, the integral scale (i.e., decorrelation length) is also revealed by the lag for which the autocovariance of the logarithm of the magnitude of the data increments drops to zero. In our example, visual inspection indeed suggests a decorrelation length value somehere between 2^6 and 2^8.

% compute auto-covariance of log-increments of the time series
for i=1:length(CC2); xx(:,i)=xcov(log(abs(diff(data(i,:))))); end
xx=xx(floor(end/2)+1:end,:); tx=1:length(xx);
% plot autocovariance (overlay of all data components)
figure(2);clf;set(gcf,'position',[1000 500 500 300]);
plot(log2(tx),xx);grid on;xlabel('log_{2}(Lag)'); ylabel('Cov[ log( |diff(X)| ) ]');

Bayesian model fit analysis

% Data cutting and computation of DWT and DFT for selected scaling range
[lxft,param_data,data0,pdata] = preproc_data_mvmfa(data,sw,dsw,j1,j2,Nwt,gamint); % Data cutting and computation of DWT and DFT

% evaluation of value of log-posterior for different integral scale values
ISL=2.^([2:9]);
for ini=1:length(ISL);
    NII=ISL(ini);     % number of integral scales (default: 1)
    [param_model] = init_model_mvmfa(param_data,'NIScale',NII,'PostVal',1,'Nbi',100,'Nmc',200); % Computation of fixed terms of the spectral density, reparametrization + initialization of prior parameters
    REGU=[0 0]; [eec2,Tc2,ec20tp,Tc20] = est_bayes_mvmfa(lxft,param_model,REGU);
    PV(:,ini)=Tc2.Pt;
end

% find maximizer of log-posterior
[mlogP,ilp] = max(mean(PV));
Est_IS=ISL(ilp),Est_Decorr_Len=N/ISL(ilp)
figure(3);clf;set(gcf,'position',[500 100 500 300]);
plot(log2(ISL),mean(PV),'ro-');hold on; plot(log2(ISL(ilp)),mean(PV(:,ilp)),'b*');hold off;grid on;
set(gca,'xtick',log2(ISL));set(gca,'xticklabel',ISL);xlabel('# integral scales IS');ylabel('log-posterior value');
Est_IS =
    64
Est_Decorr_Len =
   256

Comparison of estimates obtained for different integral scale values

Finally, compute the Bayesian estimates using the selected integral scale value and compare them to Bayesian estimates obtained with far too small and far too large values for the integral scale. The plot also shows the theoretical value for c2, and indicates that the integral scale value selected above is indeed more pertinent than arbitrarily selected (small or large) values.

% compute linear fit (as a baseline reference)
clear ec2; [ec2{1},ec20{1},ec1{1},ec10{1}] = linfit_cp(param_data,0);

% Compute bayesian estimators for different integral scale values
% with estimated number of integral scales
NII=ISL(ilp);
[param_model] = init_model_mvmfa(param_data,'NIScale',NII,'Nbi',500,'Nmc',1500); % Initialize model
    REGU=[0 0]; [ec2{2}(:,1)] = est_bayes_mvmfa(lxft,param_model,REGU); % Compute Bayesian estimators
% with too small number of integral scales
NII=ISL(ilp-1);
[param_model] = init_model_mvmfa(param_data,'NIScale',NII,'Nbi',500,'Nmc',1500); % Initialize model
    REGU=[0 0]; [ec2{2}(:,2)] = est_bayes_mvmfa(lxft,param_model,REGU); % Compute Bayesian estimators
% with too large number of integral scales
NII=ISL(ilp+1);
[param_model] = init_model_mvmfa(param_data,'NIScale',NII,'Nbi',500,'Nmc',1500); % Initialize model
    REGU=[0 0]; [ec2{2}(:,3)] = est_bayes_mvmfa(lxft,param_model,REGU); % Compute Bayesian estimators

% plot estimates
figure(4);clf;tt=1:length(squeeze(CC2(:,:,1)));set(gcf,'position',[10 100 500 300]);
plot(tt,squeeze(CC2(:,:,1)),'r--',tt,squeeze(ec2{1}(:,:,1)),'mx-',tt,squeeze(ec2{2}(:,1)),'ko-',...
    tt,squeeze(ec2{2}(:,2)),'ro-',tt,squeeze(ec2{2}(:,3)),'go-');grid on;set(gca,'xlim',[1 15]);
legend('theo','LF',['IG #is=',num2str(ISL(ilp))],['IG #is=',num2str(ISL(ilp-1))],...
    ['IG #is=',num2str(ISL(ilp+1))]); xlabel('time series'); title('Estimates for c2');

The root-mean-squared-error values for estimates for c2 confirm that the selected value for the integral scale is pertinent.

% compute and display RMSE values
RMSE(1)=sqrt(mean((ec2{1}(:,1)-CC2').^2)); for k=2:4; RMSE(k)=sqrt(mean((ec2{2}(:,k-1)-CC2').^2)); end
disp(' RMSE values:');
disp(['LF          IG IS=',num2str(ISL(ilp)),'    IG IS=',num2str(ISL(ilp-1)),'     IG IS=',num2str(ISL(ilp+1))]);
disp(num2str(round(1e4*RMSE)/1e4))
 RMSE values:
LF          IG IS=64    IG IS=32     IG IS=128
0.0125      0.0026      0.0041      0.0027