ﻻ يوجد ملخص باللغة العربية
This article introduces subbagging (subsample aggregating) estimation approaches for big data analysis with memory constraints of computers. Specifically, for the whole dataset with size $N$, $m_N$ subsamples are randomly drawn, and each subsample with a subsample size $k_Nll N$ to meet the memory constraint is sampled uniformly without replacement. Aggregating the estimators of $m_N$ subsamples can lead to subbagging estimation. To analyze the theoretical properties of the subbagging estimator, we adapt the incomplete $U$-statistics theory with an infinite order kernel to allow overlapping drawn subsamples in the sampling procedure. Utilizing this novel theoretical framework, we demonstrate that via a proper hyperparameter selection of $k_N$ and $m_N$, the subbagging estimator can achieve $sqrt{N}$-consistency and asymptotic normality under the condition $(k_Nm_N)/Nto alpha in (0,infty]$. Compared to the full sample estimator, we theoretically show that the $sqrt{N}$-consistent subbagging estimator has an inflation rate of $1/alpha$ in its asymptotic variance. Simulation experiments are presented to demonstrate the finite sample performances. An American airline dataset is analyzed to illustrate that the subbagging estimate is numerically close to the full sample estimate, and can be computationally fast under the memory constraint.
This paper considers fixed effects estimation and inference in linear and nonlinear panel data models with random coefficients and endogenous regressors. The quantities of interest -- means, variances, and other moments of the random coefficients --
This paper considers inference on fixed effects in a linear regression model estimated from network data. An important special case of our setup is the two-way regression model. This is a workhorse technique in the analysis of matched data sets, such
In this study, we develop a novel estimation method of the quantile treatment effects (QTE) under the rank invariance and rank stationarity assumptions. Ishihara (2020) explores identification of the nonseparable panel data model under these assumpti
Factor structures or interactive effects are convenient devices to incorporate latent variables in panel data models. We consider fixed effect estimation of nonlinear panel single-index models with factor structures in the unobservables, which includ
We develop new semiparametric methods for estimating treatment effects. We focus on a setting where the outcome distributions may be thick tailed, where treatment effects are small, where sample sizes are large and where assignment is completely rand