Discrete and Continuous Models and Applied Computational Science
ISSN 2658-4670, 2658-7149
Peoples' Friendship University of Russia named after Patrice Lumumba (RUDN University)
22916
DOI: 10.22363/2658-4670-2019-27-4-343-354
Research Article
Vine copulas structures modeling on Russian stock market
Eugeny Yu. Shchetinin
<p>Doctor of Physical and Mathematical Sciences, lecturer at the Department of Data Analysis, Decision Making and Financial Technologies</p>
riviera-molto@mail.ru
Financial University under the Government of Russian Federation
15.12.2019, vol. 27, no. 4, pp. 343-354, 19.02.2020
Copyright © 2019, Shchetinin E.Y.
<p>Pair-copula constructions have proven to be a useful tool in statistical modeling, particularly in the field of finance. The copula-based approach can be used to choose a model that describes the dependence structure and marginal behaviour of the data in an efficient way, but it is usually applied to pairs of securities. In contrast, vine copulas provide greater flexibility and permit the modeling of complex dependency patterns using the rich variety of bivariate copulas, which may be arranged and analysed in a tree structure. However, the number of possible configurations of a vine copula grows exponentially as the number of variables increases, making model selection a major challenge. To learn the best possible model, one therefore has to identify the best possible structure, which requires identifying the connections between the variables and selecting among the candidate bivariate copulas for each pair in the structure. This paper features the use of regular vine copulas in the analysis of the co-dependencies of four major Russian stock market securities, namely Gazprom, Sberbank, Rosneft and FGC UES, represented by the RTS index. 
For these stocks, D-vine structures of bivariate copulas were constructed; the resulting models are described by Gumbel, Student, BB1 and BB7 copulas, and estimates of their parameters were obtained. Computer simulations showed that the D-vine structure of bivariate copulas approximates the explored data with high accuracy and that our approach is effective in general.</p>
Keywords: copula, multivariate models, dependence structure, vines, securities, financial analysis, multivariate structures of statistical dependence, vine copulas
<p>1. Introduction
In the field of financial analysis, finding new useful models and improving the existing ones is a constant challenge. Finding an appropriate multivariate model that efficiently describes both the dependence structure and the marginal behaviour of the data can be a very difficult task, especially in higher dimensions. The copula-based approach tends to outperform other methods in financial analysis, for example in modeling financial returns. The $n$-dimensional Student copula is usually a good choice for financial data of various kinds [1], and as such deserves special attention. Generally speaking, thorough analysis is still needed for the best results, especially if the data behave differently in the two tails, in which case the Student copula might not capture the dependence structure very well. We will discuss the pairwise decomposition of a multivariate model into bivariate copulas as laid out by Aas et al. [2]. This approach lets us easily track the parameters relevant to tail dependence. In order to find the most appropriate approach for our specific case, we will rely on the detailed comparison and overview of different approaches by Berg [3]. The relatively recent concept of vines, introduced by T. Bedford and R. M. Cooke [4], is very relevant to the pairwise decomposition of multivariate distributions. 
Vines are essentially nested sequences of trees that can be used to efficiently represent a pairwise decomposition. We will focus primarily on D-vines and canonical vines [5], [6]. Our main sources for the elements of copula theory are R. B. Nelsen and H. Joe [7]-[9].
2. Basics of copula theory
Definition (pair-copula). A pair-copula, or simply copula, is a function $C\colon [0,1]^2 \to [0,1]$ that satisfies the following properties.
For any $u, v \in [0,1]$:
1) $C(u, 0) = C(0, v) = 0$;
2) $C(u, 1) = u$, $C(1, v) = v$.
For any $u_1, u_2, v_1, v_2 \in [0,1]$ such that $u_1 \leqslant u_2$ and $v_1 \leqslant v_2$:
3) $C(u_1, v_1) - C(u_1, v_2) - C(u_2, v_1) + C(u_2, v_2) \geqslant 0$.
One of the most important theorems of copula theory is Sklar's theorem. In terms of probability theory, it states that any joint distribution function can be written in terms of the marginal (univariate) distribution functions and a copula function that describes the dependence structure between the random variables.
Sklar's theorem. Let $X$ and $Y$ be random variables with distribution functions $F$ and $G$, respectively, and let $H$ be their joint distribution function. Then there exists a copula $C\colon [0,1]^2 \to [0,1]$ such that for any $x, y \in \mathbb{R}$ the following equation holds:
$$H(x, y) = C(F(x), G(y)). \tag{1}$$
If $F$ and $G$ are continuous, then $C$ is unique. If not, then $C$ is unique only on $\operatorname{Ran}F \times \operatorname{Ran}G$ (here $\operatorname{Ran}F$ is the range of $F$ and $\operatorname{Ran}G$ is the range of $G$). Conversely, if $C$ is a copula and $F$ and $G$ are distribution functions of $X$ and $Y$, respectively, then $H$, defined by (1), is a joint distribution function for the random variables $X$ and $Y$, and $F$ and $G$ are the marginal distribution functions for $X$ and $Y$, respectively.
It is not hard to describe the $n$-dimensional case as well. But first, we have to define the notions of an $n$-box and of the $H$-volume of an $n$-box, and fix some notation. Let $\mathbf{a} = (a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$ and $\mathbf{b} = (b_1, b_2, \ldots, b_n) \in \mathbb{R}^n$; $\mathbf{a} \leqslant \mathbf{b}$ means $a_k \leqslant b_k$ for all $k$ from 1 to $n$. When $\mathbf{a} \leqslant \mathbf{b}$ we will use the following notation: $[\mathbf{a}, \mathbf{b}] = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]$. The construction above is called the $n$-box. 
The vectors of the type $\mathbf{c} = (c_1, c_2, \ldots, c_n)$, where $c_k$ equals $a_k$ or $b_k$ for all $k$, are called the vertices of the $n$-box. The notion of the $H$-volume of the $n$-box, $V_H([\mathbf{a}, \mathbf{b}])$, is discussed in [10], [11].
Definition (n-copula). An $n$-copula is a function $C\colon [0,1]^n \to [0,1]$ that satisfies the following properties.
For any $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ in $[0,1]^n$:
1) $C(\mathbf{u}) = 0$ if any $u_k = 0$;
2) $C(\mathbf{u}) = u_k$ if all the coordinates except $u_k$ are equal to 1.
For any $\mathbf{a}, \mathbf{b} \in [0,1]^n$ such that $\mathbf{a} \leqslant \mathbf{b}$:
3) $V_C([\mathbf{a}, \mathbf{b}]) \geqslant 0$.
The $n$-dimensional version of Sklar's theorem is discussed in Nelsen [7], and conditional copulas are discussed in Patton [12].
3. Decomposition of a multivariate distribution function using pair-copula constructions
The general product rule (also called the chain rule of probability) allows us to decompose a multivariate density function in the following, non-unique way:
$$f_{12\ldots n} = f_1 \cdot f_{2|1} \cdot f_{3|12} \cdots f_{n|12\ldots n-1}. \tag{2}$$
If we assume that the joint distribution function $F$ is strictly continuous and use the definition of a copula and Sklar's theorem, we get
$$f_{12\ldots n} = c_{12\ldots n} \cdot f_1 \cdot f_2 \cdots f_n. \tag{3}$$
To get to the pair-copula decomposition we will also have to use factorizations of the following type:
$$f_{2|1} = \frac{c_{12} f_1 f_2}{f_1} = c_{12} f_2, \tag{4}$$
$$f_{3|12} = \frac{f_{123}}{f_{12}} = \frac{f_{23|1} f_1}{f_{2|1} f_1} = \frac{f_{23|1}}{f_{2|1}} = \frac{c_{23|1} f_{2|1} f_{3|1}}{f_{2|1}} = c_{23|1} f_{3|1} = c_{23|1} c_{13} f_3. \tag{5}$$
Now let us apply (2), (3), (4) and (5) to a 3-dimensional density function to get a pair-copula decomposition:
$$f_{123} = f_1 f_{2|1} f_{3|12} = f_1 \cdot c_{12} f_2 \cdot c_{23|1} c_{13} f_3 = c_{12} c_{13} c_{23|1} \prod_{k=1}^{3} f_k. \tag{6}$$
If we pick another conditioning variable we get another decomposition, for example
$$f_{123} = f_1 f_{2|1} f_{3|12} = f_1 \cdot c_{12} f_2 \cdot c_{13|2} c_{23} f_3 = c_{12} c_{13|2} c_{23} \prod_{k=1}^{3} f_k. \tag{7}$$
A 3-variable density function admits three such pair-copula decompositions and a 4-variable density function already admits 24 [13], [14]. This number rises rapidly with the number of dimensions, which makes it very complicated to find the decomposition that best preserves the known information about the dependence structure. The concept of vines is very useful in this regard.
4. The concept of vines
Vines are a concept first introduced by Bedford and Cooke [4]. A vine is a sequence of trees $\{T_j\}$ in which the edges of $T_j$ are the nodes of $T_{j+1}$. 
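To make this nesting of trees concrete, the following minimal sketch enumerates the trees of a D-vine, the path-like vine type used later in this paper. It is an illustration only, not the author's code: the function names `dvine_trees`, `edge_label` and `joins_adjacent_edges` are hypothetical, and each edge is stored as a conditioned pair plus a conditioning set, following labels such as 13|2.

```python
def dvine_trees(n):
    """Return the trees T_1, ..., T_{n-1} of a D-vine on variables 1..n.

    Each tree is a list of edges; an edge is a tuple
    (conditioned_pair, conditioning_set).
    """
    trees = []
    for j in range(1, n):                  # tree index j = 1, ..., n-1
        edges = []
        for i in range(1, n - j + 1):      # edge index i = 1, ..., n-j
            conditioned = (i, i + j)
            conditioning = tuple(range(i + 1, i + j))
            edges.append((conditioned, conditioning))
        trees.append(edges)
    return trees


def edge_label(edge):
    """Format an edge as e.g. '13|2' (or '12' when nothing is conditioned on)."""
    (a, b), conditioning = edge
    label = f"{a}{b}"
    if conditioning:
        label += "|" + "".join(str(v) for v in conditioning)
    return label


def joins_adjacent_edges(trees):
    """Check that each edge of tree j+1 connects two neighbouring edges of tree j."""
    for lower, upper in zip(trees, trees[1:]):
        for i, (pair, conditioning) in enumerate(upper):
            left, right = lower[i], lower[i + 1]
            variables = set(left[0]) | set(left[1]) | set(right[0]) | set(right[1])
            if variables != set(pair) | set(conditioning):
                return False
    return True


if __name__ == "__main__":
    trees = dvine_trees(5)
    for level, edges in enumerate(trees, start=1):
        print(f"T{level}:", ", ".join(edge_label(e) for e in edges))
    print("edges of T_j are nodes of T_{j+1}:", joins_adjacent_edges(trees))
```

For five variables this lists the edges 12, 23, 34, 45 in the first tree and ends with the single edge 15|234 in the fourth, ten pair-copulas in total.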
Each vine is a representation of a particular way of decomposing a multivariate distribution. The two common kinds of vines that we will use in our work are canonical vines (C-vines) and D-vines. Different types of vines represent different types of dependency structures. A canonical vine corresponds to the case where one main variable interacts with all the others, while in the case of a D-vine there is no such main variable. This idea is illustrated in Figures 1 and 2.
Figure 1. C-vine (five variables, trees $T_1$ to $T_4$)
Figure 2. D-vine (five variables, trees $T_1$ to $T_4$)
The following general formulas give us the expressions for the decomposition of an $n$-dimensional density function using the D-vine and the canonical vine.
D-vine:
$$f_{12\ldots n} = \prod_{k=1}^{n} f_k \prod_{j=1}^{n-1} \prod_{i=1}^{n-j} c_{i,i+j|i+1,\ldots,i+j-1}. \tag{8}$$
Canonical vine:
$$f_{12\ldots n} = \prod_{k=1}^{n} f_k \prod_{j=1}^{n-1} \prod_{i=1}^{n-j} c_{j,j+i|1,\ldots,j-1}. \tag{9}$$
Each edge in each of the trees corresponds to a pair-copula, the density of which is one of the factors of the pair-copula construction, as we can see in (8) and (9). The first tree, $T_1$, should be constructed in a way that best represents the supposed dependence structure of the variables. Alternative constructions may use the copula parameter estimates to gain insight into the dependence structure. For example, we could assign a Student-t copula to all the pairs and, knowing that a low number of degrees of freedom indicates strong dependence, construct a tree that represents that dependence structure.
Algorithm 1. Sequential algorithm
Input: Data $(x_{l1}, \ldots, x_{ln})$, $l = 1, \ldots, N$ (realization of i.i.d. random vectors).
Output: R-vine copula specification, i.e., $\mathcal{V}$, $\mathcal{B}$.
1: Calculate the empirical Kendall's tau $\hat{\tau}_{j,k}$ for all possible variable pairs $\{j, k\}$, $1 \leqslant j < k \leqslant n$.
2: Select the spanning tree that maximizes the sum of absolute empirical Kendall's taus, i.e., $\max \sum_{e = \{j,k\} \text{ in spanning tree}} |\hat{\tau}_{j,k}|$.
3: For each edge $\{j, k\}$ in the selected spanning tree, select a copula and estimate the corresponding parameter(s). Then transform $\hat{F}_{j|k}(x_{lj} | x_{lk})$ and $\hat{F}_{k|j}(x_{lk} | x_{lj})$, $l = 1, \ldots, N$, using the fitted copula $\hat{C}_{jk}$.
4: for $i = 2, \ldots, n - 1$ do {Iteration over the trees}
5: Calculate the empirical Kendall's tau $\hat{\tau}_{j,k|D}$ for all conditional variable pairs $\{j, k | D\}$ that can be part of tree $T_i$, i.e. all edges fulfilling the proximity condition [14].
6: Among these edges, select the spanning tree that maximizes the sum of absolute empirical Kendall's taus, i.e., $\max \sum_{e = \{j,k|D\} \text{ in spanning tree}} |\hat{\tau}_{j,k|D}|$.
7: For each edge $\{j, k | D\}$ in the selected spanning tree, select a conditional copula and estimate the corresponding parameter(s). Then transform $\hat{F}_{j|k \cup D}(x_{lj} | x_{lk}, \mathbf{x}_{lD})$ and $\hat{F}_{k|j \cup D}(x_{lk} | x_{lj}, \mathbf{x}_{lD})$, $l = 1, \ldots, N$, using the fitted copula $\hat{C}_{jk|D}$.
8: end for
5. Numerical experiment: choosing the right vine structure
We will now apply the theory and methods discussed above to the analysis, modeling and visualization of the returns of four major Russian companies. Our dataset consists of the log-returns of Gazprom, Sberbank, Rosneft and FGC UES from 06.06.2014 to 06.06.2018. We will use the VineCopula package for the R programming language for most of our computational needs [15]. Our main goal is to build a model that best represents the core features of our data's dependency structure. We will use the sequential method [13] with Akaike's information criterion (AIC) [11], [16]-[18] to determine the most appropriate copula families, and one of the versions of Prim's algorithm to determine maximum spanning trees [19], [20], in order to ultimately determine and specify the most appropriate vine structure. The results are provided below. Figure 3 illustrates the D-vine structure of our model.
Figure 3. D-vine structure of our model
We also need to verify our model. 
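One way to make this check concrete is to compare rank correlations pair by pair between the original and the simulated samples. The following is a minimal sketch under stated assumptions, not the author's code (the paper's computations were done with the VineCopula package for R): `average_ranks`, `spearman_rho` and `largest_gap` are hypothetical helper names, and the two dictionaries simply restate the values reported in Tables 1 and 2.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y.

    Assumes both samples have the same length and are not constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Values restated from Table 1 (original data) and Table 2 (simulated data).
# S = Sberbank, G = Gazprom, F = FGC UES, R = Rosneft.
TABLE_1 = {("S", "G"): 0.64, ("S", "F"): 0.5, ("S", "R"): 0.63,
           ("G", "F"): 0.49, ("G", "R"): 0.67, ("F", "R"): 0.49}
TABLE_2 = {("S", "G"): 0.6, ("S", "F"): 0.54, ("S", "R"): 0.65,
           ("G", "F"): 0.5, ("G", "R"): 0.65, ("F", "R"): 0.53}


def largest_gap(original, simulated):
    """Largest absolute difference between two sets of pairwise correlations."""
    return max(abs(original[pair] - simulated[pair]) for pair in original)


if __name__ == "__main__":
    print("largest gap:", round(largest_gap(TABLE_1, TABLE_2), 2))
```

On the reported values the largest absolute difference between the two tables is 0.04; with the raw return series, `spearman_rho` would be applied to each pair of columns in both samples.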
The verification process involves drawing observations from the vine and comparing the empirical values of Spearman's rho, as well as some of the plots, for the original observations and the sampled observations. In other words, we must observe how well the dependence structure was preserved. For the sake of brevity, let us denote Rosneft by $R$, Gazprom by $G$, FGC UES by $F$ and Sberbank by $S$. Using AIC and maximum likelihood estimation (MLE) we have determined the following pair-copulas.
In the first tree (unconditional pairs):
1. a rotated BB1 copula with $\theta = 0.1980236$ and $\delta = 1.421392$;
2. a rotated BB7 copula with $\theta = 1.920555$ and $\delta = 0.7580773$;
3. a rotated BB7 copula with $\theta = 2.025809$ and $\delta = 0.9424809$.
In the higher trees (conditional pairs):
4. a rotated Gumbel copula with $\theta = 1.2104850$;
5. a t-copula with $\rho = 0.3746501$ and $\nu = 6.7874375$;
6. a rotated BB8 copula with $\theta = 1.4269045$ and $\delta = 0.8675492$.
The D-vine tree structure for our model is presented in Figs. 4-6. The corresponding graphs of the bivariate copula density models with estimated parameters are shown in Fig. 7.
Figure 4. First tree
Figure 5. Second tree
Figure 6. Final tree
Figure 7. Bivariate copula densities for the vine structure of our model (panels (a)-(f))
We drew 1003 observations from our D-vine, the same number as in our real-world dataset, and calculated Spearman's rho values, shown below in Table 2. Judging from Tables 1 and 2 and the overlaid plots, the modeled dependencies were captured in a satisfactory way. A graphical comparison of the empirical and simulated data with their scatterplots is presented in Fig. 8.
Table 1. Empirical Spearman's rho values for the original observations
     G      F      R
S    0.64   0.5    0.63
G    -      0.49   0.67
F           -      0.49
Table 2. Empirical Spearman's rho values for the observations from sampling
     G      F      R
S    0.6    0.54   0.65
G    -      0.5    0.65
F           -      0.53
Figure 8. Real and simulated data comparison
6. Conclusions
In this paper we have demonstrated the usefulness of the vine copula-based approach to modeling a real-world dataset with a complex dependence structure. 
We have successfully specified a model that captures some of the essential dependencies that characterize our dataset. By focusing, for the sake of brevity, exclusively on C-vines and D-vines and on specific methods of copula selection and parameter estimation, we necessarily neglected other approaches that could provide additional insights. The extensive functionality provided by the VineCopula package for the R programming language let us circumvent many computational difficulties, allowing for faster and more efficient analysis.</p>
[K. Aas and I. Hobaek Haff, “The generalized hyperbolic skew Student’s t-distribution,” Journal of Financial Econometrics, vol. 4, pp. 275-309, Jan. 2006. DOI: 10.1093/jjfinec/nbj006.]
[K. Aas, C. Czado, A. Frigessi, and H. Bakken, “Pair-copula constructions of multiple dependence,” Insurance: Mathematics and Economics, vol. 44, no. 2, pp. 182-198, 2009.]
[D. Berg, “Copula goodness-of-fit testing: an overview and power comparison,” European Journal of Finance, vol. 15, pp. 675-701, 2009. DOI: 10.1080/13518470802697428.]
[T. Bedford and R. M. Cooke, “Vines - a new graphical model for dependent random variables,” The Annals of Statistics, vol. 30, no. 4, pp. 1031-1068, 2002. DOI: 10.1214/aos/1031689016.]
[A. Panagiotelis, C. Czado, H. Joe, and J. Stöber, “Model selection for discrete regular vine copulas,” Comput. Stat. Data Anal., vol. 106, pp. 138-152, 2017. DOI: 10.1016/j.csda.2016.09.007.]
[J.-D. Fermanian, “Recent developments in copula models,” Econometrics, vol. 5, no. 34, 2017. DOI: 10.3390/econometrics5030034.]
[R. B. Nelsen, An introduction to copulas. New York: Springer, 1999.]
[H. Joe, H. Li, and A. K. Nikoloulopoulos, “Tail dependence functions and vine copulas,” Journal of Multivariate Analysis, vol. 101, pp. 252-270, 2010. DOI: 10.1016/j.jmva.2009.08.002.]
[H. Joe, “Dependence comparisons of vine copulae with four or more variables,” in D. Kurowicka and H. Joe (Eds.), Dependence Modeling. 
Singapore: World Scientific, 2010.]
[A. K. Nikoloulopoulos, H. Joe, and H. Li, “Vine copulas with asymmetric tail dependence and applications to financial return data,” Computational Statistics and Data Analysis, vol. 56, no. 11, pp. 3659-3673, 2012.]
[Modeling dependence in econometrics. Berlin, Heidelberg: Springer Verlag, 2014. DOI: 10.1007/978-3-319-03395-2.]
[A. J. Patton, “Modelling asymmetric exchange rate dependence,” International Economic Review, vol. 47, no. 2, pp. 527-556, 2006. DOI: 10.1111/j.1468-2354.2006.00387.x.]
[E. C. Brechmann, C. Czado, and K. Aas, “Truncated regular vines in high dimensions with application to financial data,” Canadian Journal of Statistics, vol. 40, no. 1, pp. 68-85, 2012. DOI: 10.1002/cjs.10141.]
[J. Dißmann, E. Brechmann, C. Czado, and D. Kurowicka, “Selecting and estimating regular vine copulae and application to financial returns,” Computational Statistics & Data Analysis, vol. 59, pp. 52-69, 2013. DOI: 10.1016/j.csda.2012.09.01.]
[E. C. Brechmann and U. Schepsmeier, “Modeling dependence with C- and D-vine copulas: the R package CDVine,” Journal of Statistical Software, vol. 52, no. 3, pp. 1-27, 2013. DOI: 10.18637/jss.v052.i03.]
[S. Konishi and G. Kitagawa, Information criteria and statistical modeling. 2007. DOI: 10.1007/978-0-387-71887-3.]
[H. Manner and O. Reznikova, “A survey on time-varying copulas: specification, simulations and application,” Econometric Reviews, vol. 31, no. 6, pp. 654-687, 2012. DOI: 10.1080/07474938.2011.608042.]
[L. Chollete, A. Heinen, and A. Valdesogo, “Modeling international financial returns with a multivariate regime switching copula,” J. Financ. Econ., vol. 7, pp. 437-480, 2009. DOI: 10.2139/ssrn.1102632.]
[T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to algorithms, 2nd Edition. MIT Press, 2001.]
[D. E. Allen, M. A. Ashraf, M. McAleer, R. J. Powell, and A. K. Singh, “Financial dependence analysis: applications of vine copulas,” Statistica Neerlandica, vol. 67, no. 4, pp. 
403-435, 2013. DOI: 10.1111/stan.12015.]