We introduce a method to predict which correlation matrix coefficients are likely to change sign in the future in the high-dimensional regime, i.e. when the number of features is larger than the number of samples per feature. In this regime, the stability of correlation signs (two-by-two relationships) is found to depend on three-by-three relationships, an idea inspired by Heider's social cohesion theory. We apply our method to historical US and Hong Kong equities data to illustrate how the structure of a correlation matrix influences the stability of the signs of its coefficients.
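As a rough illustration of what a three-by-three relationship can look like, the sketch below counts, for each pair (i, j), the fraction of triangles (i, j, k) that are unbalanced in Heider's sense, i.e. whose three correlation signs multiply to a negative number. This is only an assumed reading of the idea, not the method of the paper; the function name `frustrated_fraction` and the small matrix `corr` are hypothetical.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def frustrated_fraction(corr):
    """For each pair (i, j), return the fraction of third indices k for which
    the triangle (i, j, k) is unbalanced: sign(c_ij) * sign(c_jk) * sign(c_ik) < 0."""
    n = len(corr)
    scores = {}
    for i, j in combinations(range(n), 2):
        others = [k for k in range(n) if k != i and k != j]
        unbalanced = sum(
            1
            for k in others
            if sign(corr[i][j]) * sign(corr[j][k]) * sign(corr[i][k]) < 0
        )
        scores[(i, j)] = unbalanced / len(others) if others else 0.0
    return scores


# Hypothetical 4x4 correlation matrix (illustrative values, not real data).
corr = [
    [1.0, 0.6, 0.5, -0.1],
    [0.6, 1.0, 0.4, 0.3],
    [0.5, 0.4, 1.0, 0.2],
    [-0.1, 0.3, 0.2, 1.0],
]
scores = frustrated_fraction(corr)
for pair, score in sorted(scores.items()):
    print(pair, score)
```

In this toy matrix the only negative coefficient, between items 0 and 3, sits in unbalanced triangles only, which is the kind of structural signal one might expect to accompany an unstable sign.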
We propose a novel approach to sentiment data filtering for a portfolio of assets. In our framework, a dynamic factor model drives the evolution of the observed sentiment and allows us to identify two distinct components: a long-term component, modeled as a random walk, and a short-term component driven by a stationary VAR(1) process. Our model encompasses alternative approaches available in the literature and can be readily estimated by means of Kalman filtering and expectation maximization. This feature makes it convenient when the cross-sectional dimension of the portfolio increases. By applying the model to a portfolio of Dow Jones stocks, we find that the long-term component co-integrates with the market principal factor, while the short-term one captures both transient swings of the market associated with the idiosyncratic components and the correlation structure of returns. Using quantile regressions, we assess the significance of the contemporaneous and lagged explanatory power of sentiment on returns, finding strong statistical evidence when extreme returns, especially negative ones, are considered. Finally, the lagged relation is exploited in a portfolio allocation exercise.
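The decomposition into a random-walk component and a stationary autoregressive component can be sketched with a hand-written Kalman filter. The example below is a deliberately reduced version under stated assumptions: a single asset, so the VAR(1) collapses to an AR(1), and fixed parameters rather than ones estimated by expectation maximization. All names and parameter values (`kalman_local_level_ar1`, `phi`, `q_mu`, `q_s`, `r`, `p0`) are hypothetical.

```python
import random


def kalman_local_level_ar1(ys, phi, q_mu, q_s, r, p0=1e3):
    """Filter y_t = mu_t + s_t + noise, where mu_t is a random walk
    (variance q_mu) and s_t is an AR(1) with coefficient phi (variance q_s).
    r is the observation noise variance. Returns a list of (mu_t, s_t)."""
    mu, s = ys[0], 0.0                  # crude initial state (assumption)
    P = [[p0, 0.0], [0.0, p0]]          # diffuse-like initial covariance
    filtered = []
    for y in ys:
        # Prediction step with F = diag(1, phi) and Q = diag(q_mu, q_s).
        mu_p, s_p = mu, phi * s
        P00 = P[0][0] + q_mu
        P01 = phi * P[0][1]
        P10 = phi * P[1][0]
        P11 = phi * phi * P[1][1] + q_s
        # Innovation with observation vector H = [1, 1].
        v = y - (mu_p + s_p)
        c0 = P00 + P10                  # (H P)[0]
        c1 = P01 + P11                  # (H P)[1]
        S = c0 + c1 + r                 # innovation variance
        k0 = (P00 + P01) / S            # Kalman gain for mu
        k1 = (P10 + P11) / S            # Kalman gain for s
        # Update step.
        mu = mu_p + k0 * v
        s = s_p + k1 * v
        P = [[P00 - k0 * c0, P01 - k0 * c1],
             [P10 - k1 * c0, P11 - k1 * c1]]
        filtered.append((mu, s))
    return filtered


# Simulate a hypothetical sentiment series and filter it.
random.seed(0)
mu_true, s_true, ys = 0.0, 0.0, []
for _ in range(200):
    mu_true += random.gauss(0.0, 0.1)                  # long-term random walk
    s_true = 0.5 * s_true + random.gauss(0.0, 0.2)     # short-term AR(1)
    ys.append(mu_true + s_true + random.gauss(0.0, 0.1))

filtered = kalman_local_level_ar1(ys, phi=0.5, q_mu=0.01, q_s=0.04, r=0.01)
print(filtered[-1])
```

A multi-asset version would replace the scalars with vectors and matrices and wrap the filter in an EM loop to estimate the parameters; neither step is shown here.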
Heterogeneity of economic agents is emphasized in a new trend of macroeconomics. Accordingly, the emerging discipline requires one to replace the production function, one of the key ideas in conventional economics, by an alternative which can take explicit account of the distribution of firms' production activities. In this paper we propose a new idea referred to as a production copula; a copula is an analytic means for modeling dependence among variables. Such a production copula predicts the value added yielded by firms with given capital and labor in a probabilistic way. It is thereby in sharp contrast to the production function, where the output of firms is completely deterministic. We demonstrate the empirical construction of a production copula using financial data of listed firms in Japan. Analysis of the data shows that there are significant correlations among their capital, labor and value added, and confirms that the values added are too widely scattered to be represented by a production function. We employ four models for the production copula, that is, trivaria
In this study, the fluctuation-dissipation theory is invoked to shed light on input-output interindustrial relations at a macroscopic level by its application to IIP (indices of industrial production) data for Japan. Statistical noise arising from the finiteness of the time series data is carefully removed by using random matrix theory in an eigenvalue analysis of the correlation matrix; as a result, two dominant eigenmodes are detected. Our previous study successfully used these two modes to demonstrate the existence of intrinsic business cycles. Here a correlation matrix constructed from the two modes describes genuine interindustrial correlations in a statistically meaningful way. Furthermore, it enables us to quantitatively discuss the relationship between shipments of final demand goods and production of intermediate goods in a linear response framework. We also investigate distinctive external stimuli for the Japanese economy exerted by the current global economic crisis. These stimuli are derived from residuals of moving average fluctuations of the IIP remaining after subtracting the long-period components arising from inherent business cycles. The observation reveals that the fluctuation-dissipation theory is applicable to an economic system that is supposed to be far from physical equilibrium.
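One common way to separate genuine eigenmodes from finite-sample noise is to compare the eigenvalues of the correlation matrix with the Marchenko-Pastur upper edge for uncorrelated series. The abstract does not state which random matrix criterion was used, so the sketch below assumes this one; the function names and the eigenvalues in `eigs` are hypothetical and not taken from the IIP data.

```python
import math


def marchenko_pastur_bounds(n_series, n_obs):
    """Edges of the Marchenko-Pastur spectrum for the correlation matrix of
    n_series uncorrelated, unit-variance series of length n_obs.
    The lower edge formula assumes n_series <= n_obs."""
    q = n_series / n_obs
    lam_minus = (1.0 - math.sqrt(q)) ** 2
    lam_plus = (1.0 + math.sqrt(q)) ** 2
    return lam_minus, lam_plus


def significant_modes(eigenvalues, n_series, n_obs):
    """Keep only eigenvalues above the noise edge lam_plus."""
    _, lam_plus = marchenko_pastur_bounds(n_series, n_obs)
    return [lam for lam in eigenvalues if lam > lam_plus]


# Hypothetical leading eigenvalues (illustrative values only).
eigs = [9.1, 3.4, 1.6, 1.1, 0.8, 0.5]
bounds = marchenko_pastur_bounds(n_series=100, n_obs=400)
modes = significant_modes(eigs, n_series=100, n_obs=400)
print(bounds, modes)
```

With 100 series and 400 observations the noise band is [0.25, 2.25], so only the two largest illustrative eigenvalues would be retained as dominant modes.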
Using public data (Forbes Global 2000) we show that the asset sizes for the largest global firms follow a Pareto distribution in an intermediate range, that is "interrupted" by a sharp cut-off in its upper tail, where it is totally dominated by financial firms. This flattening of the distribution contrasts with a large body of empirical literature which finds a Pareto distribution for firm sizes both across countries and over time. Pareto distributions are generally traced back to a mechanism of proportional random growth, based on a regime of constant returns to scale. This makes our findings of an "interrupted" Pareto distribution all the more puzzling, because we provide evidence that financial firms in our sample should operate in such a regime. We claim that the missing mass from the upper tail of the asset size distribution is a consequence of shadow banking activity and that it provides an (upper) estimate of the size of the shadow banking system. This estimate -- which we propose as a shadow banking index -- compares well with estimates of the Financial Stability Board until 2009, but it shows a sharper rise in shadow banking activity after 2010. Finally, we propose a proportional random growth model that reproduces the observed distribution, thereby providing a quantitative estimate of the intensity of shadow banking activity.
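A standard way to check for a Pareto tail is the Hill estimator of the tail exponent, computed from the k largest observations. The sketch below applies it to synthetic data; it is not the estimation procedure of the paper, and the exponent 1.0, the sample size and the cut-off `k` are assumptions chosen purely for illustration.

```python
import math
import random


def hill_estimator(values, k):
    """Hill estimate of the Pareto tail exponent from the k largest values.
    Uses the (k+1)-th largest value as the tail threshold."""
    xs = sorted(values, reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < len(values)")
    threshold = xs[k]
    return k / sum(math.log(x / threshold) for x in xs[:k])


# Synthetic Pareto sample with exponent 1.0 (assumed, not from Forbes data).
random.seed(1)
sample = [random.paretovariate(1.0) for _ in range(20000)]
alpha_hat = hill_estimator(sample, k=2000)
print(alpha_hat)
```

On real asset-size data, a sharp upper cut-off of the kind described above would show up as an estimate that drifts upward when `k` is restricted to the very largest firms.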
Our recent study of a nation-wide production network uncovered a community structure, namely how firms are connected by supplier-customer links into tightly-knit groups with high link density within groups and lower connectivity between groups. Here we propose a method to visualize the community structure by a graph layout based on a physical analogy. The layout can be calculated in a practical computation time and can be accelerated by GRAPE (gravity pipeline), a special-purpose device developed for astrophysical N-body simulation. We show that the method successfully identifies the communities in a hierarchical way by applying it to the manufacturing sector, comprising a tenth of a million nodes and half a million edges. In addition, we discuss several limitations of this method and propose a possible way to avoid them.