
On the Statistical Efficiency of Compositional Nonparametric Prediction

Submitted by Yixi Xu
Publication date: 2017
Paper language: English





In this paper, we propose a compositional nonparametric method in which a model is expressed as a labeled binary tree of $2k+1$ nodes, where each node is either a summation, a multiplication, or the application of one of the $q$ basis functions to one of the $p$ covariates. We show that in order to recover a labeled binary tree from a given dataset, the sufficient number of samples is $O(k\log(pq)+\log(k!))$, and the necessary number of samples is $\Omega(k\log(pq)-\log(k!))$. We further propose a greedy algorithm for regression in order to validate our theoretical findings through synthetic experiments.
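To make the model class concrete, here is a minimal Python sketch (not the authors' implementation) of evaluating such a labeled binary tree; the particular basis functions and node labels are illustrative assumptions:

```python
import numpy as np

# A labeled binary tree whose internal nodes are '+' or '*' and whose
# leaves apply one of q basis functions to one of p covariates.
# The basis set below is a hypothetical choice for illustration.
BASIS = [np.sin, np.cos, np.tanh]  # q = 3

class Node:
    def __init__(self, op=None, left=None, right=None, basis=None, covariate=None):
        self.op = op                # '+' or '*' for internal nodes, None for leaves
        self.left, self.right = left, right
        self.basis = basis          # index into BASIS (leaf only)
        self.covariate = covariate  # index of the covariate (leaf only)

    def evaluate(self, X):
        """Evaluate the tree on data X of shape (n_samples, p)."""
        if self.op is None:         # leaf: basis function applied to a covariate
            return BASIS[self.basis](X[:, self.covariate])
        l, r = self.left.evaluate(X), self.right.evaluate(X)
        return l + r if self.op == '+' else l * r

# Example: f(x) = sin(x_0) * cos(x_1) + tanh(x_0),
# a tree with k = 2 internal nodes, i.e. 2k+1 = 5 nodes total.
tree = Node('+',
            Node('*', Node(basis=0, covariate=0), Node(basis=1, covariate=1)),
            Node(basis=2, covariate=0))
X = np.random.randn(5, 2)
print(tree.evaluate(X))
```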


Read also

Spatio-temporal data is intrinsically high dimensional, so unsupervised modeling is only feasible if we can exploit structure in the process. When the dynamics are local in both space and time, this structure can be exploited by splitting the global field into many lower-dimensional light cones. We review light cone decompositions for predictive state reconstruction, introducing three simple light cone algorithms. These methods allow for tractable inference of spatio-temporal data, such as full-frame video. The algorithms make few assumptions on the underlying process yet have good predictive performance and can provide distributions over spatio-temporal data, enabling sophisticated probabilistic inference.
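As a hedged illustration of the light cone decomposition (not the authors' code), the sketch below collects past light cone feature vectors from a 1+1-dimensional field; the cone depth and propagation speed are assumed parameters:

```python
import numpy as np

def past_light_cones(field, depth=2, speed=1):
    """Collect past light cone feature vectors from a (time, space) array.

    For each point (t, x), the past cone gathers field values at times
    t-1 .. t-depth within spatial radius speed * lag. Returns (cones,
    targets) for predictive-state style modeling.
    """
    T, X = field.shape
    pad = depth * speed
    cones, targets = [], []
    for t in range(depth, T):
        for x in range(pad, X - pad):
            cone = []
            for lag in range(1, depth + 1):
                r = lag * speed
                cone.extend(field[t - lag, x - r:x + r + 1])
            cones.append(cone)
            targets.append(field[t, x])
    return np.array(cones), np.array(targets)

# Example: in practice each cone's predictive distribution over its target
# would then be estimated, e.g. by clustering similar cones.
field = np.random.randn(50, 30)
cones, targets = past_light_cones(field)
print(cones.shape, targets.shape)
```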
We use statistical learning methods to construct an adaptive state estimator for nonlinear stochastic systems. Optimal state estimation, in the form of a Kalman filter, requires knowledge of the system's process and measurement uncertainties. We propose that these uncertainties can be estimated from (conditioned on) past observed data, without making any assumptions about the system's prior distribution. The system's prior distribution at each time step is constructed from an ensemble of least-squares estimates on sub-sampled sets of the data via jackknife sampling. As new data is acquired, the state estimates, process uncertainty, and measurement uncertainty are updated accordingly, as described in this manuscript.
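A minimal sketch of the jackknife idea described above, assuming a linear measurement model; the delete-one least-squares ensemble and the way its spread becomes an uncertainty estimate are illustrative choices, not the authors' exact estimator:

```python
import numpy as np

def jackknife_ls_uncertainty(H, y):
    """Delete-one jackknife over least-squares fits y = H @ theta + noise.

    Returns the mean estimate and the jackknife covariance of the
    estimator, which can stand in for an unknown uncertainty in a
    Kalman-style update.
    """
    n = len(y)
    estimates = []
    for i in range(n):
        keep = np.arange(n) != i   # leave sample i out
        theta_i, *_ = np.linalg.lstsq(H[keep], y[keep], rcond=None)
        estimates.append(theta_i)
    estimates = np.array(estimates)
    theta_bar = estimates.mean(axis=0)
    dev = estimates - theta_bar
    cov = (n - 1) / n * dev.T @ dev  # standard jackknife covariance
    return theta_bar, cov

# Example with a 2-parameter linear model
rng = np.random.default_rng(0)
H = rng.normal(size=(30, 2))
y = H @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=30)
theta, cov = jackknife_ls_uncertainty(H, y)
print(theta, np.diag(cov))
```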
A new procedure, called the DDα-procedure, is developed to solve the problem of classifying d-dimensional objects into q >= 2 classes. The procedure is completely nonparametric; it uses q-dimensional depth plots and a very efficient algorithm for discrimination analysis in the depth space [0,1]^q. Specifically, the depth is the zonoid depth, and the algorithm is the α-procedure. In the case of more than two classes, several binary classifications are performed and a majority rule is applied. Special treatments are discussed for outsiders, that is, data having a zero depth vector. The DDα-classifier is applied to simulated as well as real data, and the results are compared with those of similar procedures that have been recently proposed. In most cases the new procedure has comparable error rates but is much faster than other classification approaches, including the SVM.
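To illustrate the depth-plot idea, here is a sketch in which Mahalanobis depth stands in for the zonoid depth and a maximum-depth rule stands in for the α-procedure; both substitutions are mine, chosen for brevity:

```python
import numpy as np

def mahalanobis_depth(x, data):
    """Depth of points x w.r.t. a sample: 1 / (1 + squared Mahalanobis
    distance). A simple stand-in for the zonoid depth used in the paper."""
    mu = data.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(data.T))
    d = x - mu
    return 1.0 / (1.0 + np.einsum('ij,jk,ik->i', d, cov_inv, d))

def dd_classify(x, classes):
    """Map points into the depth space [0,1]^q and assign the class of
    maximal depth (a crude substitute for the alpha-procedure)."""
    depths = np.column_stack([mahalanobis_depth(x, c) for c in classes])
    return depths.argmax(axis=1)

# Two Gaussian classes in R^2
rng = np.random.default_rng(1)
a = rng.normal([0, 0], 1, size=(100, 2))
b = rng.normal([3, 3], 1, size=(100, 2))
test = np.vstack([a[:5] + 0.1, b[:5] - 0.1])
print(dd_classify(test, [a, b]))   # expect mostly [0]*5 + [1]*5
```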
We introduce a method for reconstructing an infinitesimal normalizing flow given only an infinitesimal change to a (possibly unnormalized) probability distribution. This reverses the conventional task of normalizing flows -- rather than being given samples from an unknown target distribution and learning a flow that approximates the distribution, we are given a perturbation to an initial distribution and aim to reconstruct a flow that would generate samples from the known perturbed distribution. While this is an underdetermined problem, we find that choosing the flow to be an integrable vector field yields a solution closely related to electrostatics, and a solution can be computed by the method of Green's functions. Unlike conventional normalizing flows, this flow can be represented in an entirely nonparametric manner. We validate this derivation on low-dimensional problems, and discuss potential applications to problems in quantum Monte Carlo and machine learning.
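In one dimension the construction reduces to inverting the continuity equation: an infinitesimal perturbation satisfies $\delta\rho = -\partial_x(\rho v)$, so $v(x) = -\frac{1}{\rho(x)}\int_{-\infty}^{x}\delta\rho(t)\,dt$ (the Green's-function machinery matters in higher dimensions). A hedged numerical sketch of this 1D special case, with my own discretization and test perturbation:

```python
import numpy as np

# 1D sketch: recover the velocity field v that produces a given
# infinitesimal density perturbation delta_rho, by inverting
#   delta_rho = -d/dx (rho * v)  =>  v(x) = -(1/rho(x)) * int_{-inf}^x delta_rho
# The grid, base density, and perturbation are illustrative assumptions.

x = np.linspace(-5, 5, 1001)
dx = x[1] - x[0]
rho = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density

# Perturbation: shift the mean infinitesimally, delta_rho = -(d rho/dx) * eps
eps = 1e-3
delta_rho = -np.gradient(rho, dx) * eps

flux = np.cumsum(delta_rho) * dx               # int_{-inf}^x delta_rho dt
v = -flux / rho                                # recovered velocity field

# For a pure translation the true flow is constant, v(x) = eps;
# the recovery is accurate in the bulk where rho is not tiny.
print(float(v[300:700].mean()), eps)
```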
In federated learning problems, data is scattered across different servers, and exchanging or pooling it is often impractical or prohibited. We develop a Bayesian nonparametric framework for federated learning with neural networks. Each data server is assumed to provide local neural network weights, which are modeled through our framework. We then develop an inference approach that allows us to synthesize a more expressive global network without additional supervision or data pooling, and with as few as a single communication round. We demonstrate the efficacy of our approach on federated learning problems simulated from two popular image classification datasets.
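The core step in such weight synthesis is matching neurons across servers before combining them; the sketch below uses Hungarian matching on a cosine-similarity cost as a simplified stand-in for the paper's Bayesian nonparametric matching:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_and_average(w_global, w_local):
    """Permute rows (neurons) of a local weight matrix to best align with
    the global one, then average. Hungarian matching on negative cosine
    similarity is a simplified stand-in for nonparametric matching."""
    norm = lambda w: w / np.linalg.norm(w, axis=1, keepdims=True)
    cost = -norm(w_global) @ norm(w_local).T   # negative cosine similarity
    rows, cols = linear_sum_assignment(cost)
    return (w_global + w_local[cols]) / 2

# Two servers' hidden layers that are row permutations of each other (+ noise)
rng = np.random.default_rng(2)
w1 = rng.normal(size=(8, 4))
perm = rng.permutation(8)
w2 = w1[perm] + 0.01 * rng.normal(size=(8, 4))
merged = match_and_average(w1, w2)
print(np.abs(merged - w1).max())               # small: permutation undone
```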

