In a recent paper, Juodis and Reese (2021) (JR) show that applying the CD test proposed by Pesaran (2004) to residuals from panels with latent factors results in over-rejection. They propose a randomized test statistic to correct for the over-rejection and add a screening component to achieve power. This paper considers the same problem from a different perspective and shows that the standard CD test remains valid if the latent factors are weak. It also proposes a simple bias-corrected CD test, labelled CD*, which is shown to be asymptotically normal irrespective of whether the latent factors are weak or strong. This result holds for pure latent factor models as well as for panel regressions with latent factors. The small-sample properties of the CD* test are investigated by Monte Carlo experiments, which show that it has the correct size and satisfactory power for both Gaussian and non-Gaussian errors. In contrast, JR's test is found to over-reject in panels with non-Gaussian errors and to have low power against spatial network alternatives. The use of the CD* test is illustrated with two empirical applications from the literature.
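For reference, the standard Pesaran (2004) CD statistic that CD* bias-corrects is CD = sqrt(2T/(N(N-1))) * sum_{i<j} rho_ij, where rho_ij are pairwise sample correlations of the residuals. The sketch below computes only this uncorrected statistic from a T x N residual matrix; the function name is illustrative and the CD* correction for latent factors is not reproduced here.

import numpy as np

def pesaran_cd(e):
    # Standard Pesaran (2004) CD statistic from a T x N residual matrix.
    # Under weak cross-sectional dependence it is asymptotically N(0, 1).
    T, N = e.shape
    R = np.corrcoef(e, rowvar=False)              # pairwise correlations of the N columns
    upper_sum = R[np.triu_indices(N, k=1)].sum()  # sum over i < j
    return np.sqrt(2.0 * T / (N * (N - 1))) * upper_sum

# toy usage: cross-sectionally independent Gaussian errors
rng = np.random.default_rng(0)
print(pesaran_cd(rng.standard_normal((200, 50))))  # fluctuates around 0 under independence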
Accurate estimation of the extent of cross-sectional dependence in large panel data analysis is paramount for further statistical analysis of the data under study. Grouping data with only weak relations (cross-sectional dependence) often results in less efficient dimension reduction and worse forecasting. This paper describes cross-sectional dependence among a large number of objects (time series) via a factor model and parameterizes its extent in terms of the strength of the factor loadings. A new joint estimation method, which benefits from the dimension-reduction features of high-dimensional time series, is proposed for the parameter representing this extent together with the other parameters involved in the estimation procedure. Moreover, a joint asymptotic distribution for the pair of estimators is established. Simulations illustrate the effectiveness of the proposed estimation method in finite samples. Applications to cross-country macroeconomic variables and stock returns from the S&P 500 are studied.
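One common way to parameterize the strength of factor loadings is through an exponent alpha, with roughly N^alpha of the N units loading non-trivially on the common factor. The snippet below simulates such a panel purely to illustrate this parameterization (alpha = 1 gives a strong, pervasive factor, smaller alpha a weaker one); the paper's own parameterization and joint estimation procedure are not reproduced.

import numpy as np

def simulate_factor_panel(T, N, alpha, rng):
    # y_{it} = lambda_i * f_t + e_{it}, where only the first floor(N**alpha)
    # units have nonzero loadings; alpha indexes the extent of dependence.
    k = int(np.floor(N ** alpha))
    lam = np.zeros(N)
    lam[:k] = rng.uniform(0.5, 1.5, size=k)
    f = rng.standard_normal(T)
    e = rng.standard_normal((T, N))
    return f[:, None] * lam[None, :] + e

rng = np.random.default_rng(0)
y_weak = simulate_factor_panel(200, 100, alpha=0.5, rng=rng)    # weak factor
y_strong = simulate_factor_panel(200, 100, alpha=1.0, rng=rng)  # strong factor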
We consider a testing problem for cross-sectional dependence in high-dimensional panel data, where the number of cross-sectional units is potentially much larger than the number of observations. The cross-sectional dependence is described through a linear regression model. We study three tests, named the sum test, the max test and the max-sum test, of which the latter two are new. The sum test was originally proposed by Breusch and Pagan (1980). We design the max and sum tests for sparse and non-sparse residuals in the linear regressions, respectively, and the max-sum test is devised to cover both situations. Indeed, our simulations show that the max-sum test outperforms the other two tests, which makes it very useful in practice, where it is usually unclear whether a given data set is sparse or not. In the theoretical analysis of the three tests, we settle two conjectures regarding the sum of squares of sample correlation coefficients posed by Pesaran (2004, 2008). In addition, we establish the asymptotic theory for maxima of sample correlation coefficients appearing in the linear regression model for panel data, which, to our knowledge, is the first successful attempt of this kind. To study the max-sum test, we develop a novel method for showing asymptotic independence between maxima and sums of dependent random variables; we expect the method to be useful for other problems of this nature. Finally, an extensive simulation study and a case study are carried out. They demonstrate the advantages of our proposed methods in terms of both empirical power and robustness to whether or not the residuals are sparse.
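The raw ingredients of the sum and max tests are the sum of squared pairwise residual correlations (the Breusch-Pagan LM quantity) and the largest squared pairwise correlation. The sketch below computes these two quantities from a T x N residual matrix; the paper's tests centre, scale and (for the max-sum test) combine them, and those exact normalizations are not reproduced here.

import numpy as np

def sum_and_max_statistics(e):
    # e: T x N matrix of regression residuals.
    T, N = e.shape
    R = np.corrcoef(e, rowvar=False)
    rho2 = R[np.triu_indices(N, k=1)] ** 2
    lm_sum = T * rho2.sum()   # Breusch-Pagan (1980) LM-type sum statistic, targets dense alternatives
    t_max = T * rho2.max()    # max-type statistic, sensitive to sparse alternatives
    return lm_sum, t_max

rng = np.random.default_rng(1)
print(sum_and_max_statistics(rng.standard_normal((150, 40))))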
We study identification and estimation of causal effects in settings with panel data. Traditionally, researchers follow model-based identification strategies relying on assumptions governing the relation between the potential outcomes and the unobserved confounders. We focus on a novel, complementary approach to identification, where assumptions are made about the relation between the treatment assignment and the unobserved confounders. We introduce different sets of assumptions that follow the two paths to identification and develop a doubly robust approach. We propose estimation methods that build on these identification strategies.
In this paper, we propose a statistical model for panel data with unobservable grouped factor structures that are correlated with the regressors, where the group membership can be unknown. The factor loadings are assumed to lie in different subspaces, and subspace clustering of the factor loadings is considered. A method called the least squares subspace clustering estimate (LSSC) is proposed to estimate the model parameters by minimizing a least-squares criterion while performing the subspace clustering simultaneously. The consistency of the proposed subspace clustering is proved and the asymptotic properties of the estimation procedure are studied under certain conditions. A Monte Carlo simulation study illustrates the advantages of the proposed method. We also discuss the situations in which the number of subspaces, the dimension of the factors and the dimensions of the subspaces are unknown. For illustrative purposes, the proposed method is applied to study the linkage between income and democracy across countries while allowing for subspace patterns in the unobserved factors and factor loadings.
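To illustrate the subspace clustering step in isolation, the sketch below is a generic k-subspaces routine: it alternates between fitting a low-dimensional PCA basis to each group of loadings and reassigning each loading vector to the subspace with the smallest projection residual. This is a standard clustering building block under assumed inputs, not the authors' LSSC estimator, which estimates the regression parameters and the clustering jointly.

import numpy as np

def k_subspaces(L, n_groups, dim, n_iter=50, seed=0):
    # Cluster rows of a loading matrix L (N x r) into n_groups linear
    # subspaces of dimension dim.
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_groups, size=L.shape[0])
    for _ in range(n_iter):
        bases = []
        for g in range(n_groups):
            Lg = L[labels == g]
            if Lg.shape[0] == 0:                       # keep empty groups alive with a random basis
                Lg = rng.standard_normal((dim, L.shape[1]))
            _, _, vt = np.linalg.svd(Lg, full_matrices=False)
            bases.append(vt[:dim].T)                   # r x dim orthonormal basis of the subspace
        # projection residual of every loading vector onto every subspace
        resid = np.stack([np.linalg.norm(L - L @ B @ B.T, axis=1) for B in bases], axis=1)
        new_labels = resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# toy usage: loadings drawn from two 1-dimensional subspaces in R^3, plus noise
rng = np.random.default_rng(1)
L = np.vstack([np.outer(rng.standard_normal(30), [1.0, 0.0, 0.0]),
               np.outer(rng.standard_normal(30), [0.0, 1.0, 0.0])]) + 0.05 * rng.standard_normal((60, 3))
print(k_subspaces(L, n_groups=2, dim=1))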
We propose a generalization of the linear panel quantile regression model that accommodates both a sparse and a dense part: sparse means that, while the number of available covariates is large, potentially only a much smaller number of them have a nonzero impact on each conditional quantile of the response variable; the dense part is represented by a low-rank matrix that can be approximated by latent factors and their loadings. Such a structure poses problems for traditional sparse estimators, such as $\ell_1$-penalised quantile regression, and for traditional latent factor estimators, such as PCA. We propose a new estimation procedure, based on the ADMM algorithm, that combines the quantile loss function with $\ell_1$ and nuclear norm regularization. We show, under general conditions, that our estimator can consistently estimate both the nonzero coefficients of the covariates and the latent low-rank matrix. Our proposed model has a Characteristics + Latent Factors Asset Pricing Model interpretation: we apply our model and estimator to a large-dimensional panel of financial data and find that (i) characteristics have sparser predictive power once latent factors are controlled for, and (ii) the factors and coefficients at the upper and lower quantiles differ from those at the median.
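A minimal sketch of the ingredients of such an objective is given below, assuming a T x N outcome panel, a T x N x p covariate array and a T x N low-rank component (names and shapes are illustrative, not the paper's notation): the quantile (check) loss plus an $\ell_1$ penalty on the coefficients and a nuclear norm penalty on the low-rank part. The soft-thresholding and singular value thresholding functions are the standard proximal operators of those two penalties, which an ADMM-type solver would iterate; the paper's actual ADMM updates are not reproduced.

import numpy as np

def check_loss(u, tau):
    # Quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0}).
    return u * (tau - (u < 0))

def panel_quantile_objective(Y, X, beta, Theta, tau, lam1, lam2):
    # sum_it rho_tau(y_it - x_it' beta - theta_it) + lam1 * ||beta||_1 + lam2 * ||Theta||_*
    # Y: T x N outcomes, X: T x N x p covariates, Theta: T x N low-rank (factor) part.
    resid = Y - np.einsum('tnp,p->tn', X, beta) - Theta
    nuclear = np.linalg.svd(Theta, compute_uv=False).sum()
    return check_loss(resid, tau).sum() + lam1 * np.abs(beta).sum() + lam2 * nuclear

def soft_threshold(v, t):
    # Proximal operator of the l1 norm: elementwise soft-thresholding.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def svd_threshold(M, t):
    # Proximal operator of the nuclear norm: singular value thresholding.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - t, 0.0)) @ Vt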