Do you want to publish a course? Click here

Root-finding Approaches for Computing Conformal Prediction Set

98   0   0.0 ( 0 )
 Added by Eugene Ndiaye
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

Conformal prediction constructs a confidence set for an unobserved response of a feature vector based on previous identically distributed and exchangeable observations of responses and features. It has a coverage guarantee at any nominal level without additional assumptions on their distribution. Its computation deplorably requires a refitting procedure for all replacement candidates of the target response. In regression settings, this corresponds to an infinite number of model fit. Apart from relatively simple estimators that can be written as pieces of linear function of the response, efficiently computing such sets is difficult and is still considered as an open problem. We exploit the fact that, emph{often}, conformal prediction sets are intervals whose boundaries can be efficiently approximated by classical root-finding algorithm. We investigate how this approach can overcome many limitations of formerly used strategies and we discuss its complexity and drawbacks.



rate research

Read More

Though black-box predictors are state-of-the-art for many complex tasks, they often fail to properly quantify predictive uncertainty and may provide inappropriate predictions for unfamiliar data. Instead, we can learn more reliable models by letting them either output a prediction set or abstain when the uncertainty is high. We propose training these selective prediction-set models using an uncertainty-aware loss minimization framework, which unifies ideas from decision theory and robust maximum likelihood. Moreover, since black-box methods are not guaranteed to output well-calibrated prediction sets, we show how to calculate point estimates and confidence intervals for the true coverage of any selective prediction-set model, as well as a uniform mixture of K set models obtained from K-fold sample-splitting. When applied to predicting in-hospital mortality and length-of-stay for ICU patients, our model outperforms existing approaches on both in-sample and out-of-sample age groups, and our recalibration method provides accurate inference for prediction set coverage.
High-fidelity complex engineering simulations are highly predictive, but also computationally expensive and often require substantial computational efforts. The mitigation of computational burden is usually enabled through parallelism in high-performance cluster (HPC) architecture. In this paper, an asynchronous constrained batch-parallel Bayesian optimization method is proposed to efficiently solve the computationally-expensive simulation-based optimization problems on the HPC platform, with a budgeted computational resource, where the maximum number of simulations is a constant. The advantages of this method are three-fold. First, the efficiency of the Bayesian optimization is improved, where multiple input locations are evaluated massively parallel in an asynchronous manner to accelerate the optimization convergence with respect to physical runtime. This efficiency feature is further improved so that when each of the inputs is finished, another input is queried without waiting for the whole batch to complete. Second, the method can handle both known and unknown constraints. Third, the proposed method considers several acquisition functions at the same time and sample based on an evolving probability mass distribution function using a modified GP-Hedge scheme, where parameters are corresponding to the performance of each acquisition function. The proposed framework is termed aphBO-2GP-3B, which corresponds to asynchronous parallel hedge Bayesian optimization with two Gaussian processes and three batches. The aphBO-2GP-3B framework is demonstrated using two high-fidelity expensive industrial applications, where the first one is based on finite element analysis (FEA) and the second one is based on computational fluid dynamics (CFD) simulations.
Deep Learning (DL), in particular deep neural networks (DNN), by design is purely data-driven and in general does not require physics. This is the strength of DL but also one of its key limitations when applied to science and engineering problems in which underlying physical properties (such as stability, conservation, and positivity) and desired accuracy need to be achieved. DL methods in their original forms are not capable of respecting the underlying mathematical models or achieving desired accuracy even in big-data regimes. On the other hand, many data-driven science and engineering problems, such as inverse problems, typically have limited experimental or observational data, and DL would overfit the data in this case. Leveraging information encoded in the underlying mathematical models, we argue, not only compensates missing information in low data regimes but also provides opportunities to equip DL methods with the underlying physics and hence obtaining higher accuracy. This short communication introduces several model-constrained DL approaches (including both feed-forward DNN and autoencoders) that are capable of learning not only information hidden in the training data but also in the underlying mathematical models to solve inverse problems. We present and provide intuitions for our formulations for general nonlinear problems. For linear inverse problems and linear networks, the first order optimality conditions show that our model-constrained DL approaches can learn information encoded in the underlying mathematical models, and thus can produce consistent or equivalent inverse solutions, while naive purely data-based counterparts cannot.
Data-driven modeling increasingly requires to find a Nash equilibrium in multi-player games, e.g. when training GANs. In this paper, we analyse a new extra-gradient method for Nash equilibrium finding, that performs gradient extrapolations and updates on a random subset of players at each iteration. This approach provably exhibits a better rate of convergence than full extra-gradient for non-smooth convex games with noisy gradient oracle. We propose an additional variance reduction mechanism to obtain speed-ups in smooth convex games. Our approach makes extrapolation amenable to massive multiplayer settings, and brings empirical speed-ups, in particular when using a heuristic cyclic sampling scheme. Most importantly, it allows to train faster and better GANs and mixtures of GANs.
In support vector machine (SVM) applications with unreliable data that contains a portion of outliers, non-robustness of SVMs often causes considerable performance deterioration. Although many approaches for improving the robustness of SVMs have been studied, two major challenges remain in robust SVM learning. First, robust learning algorithms are essentially formulated as non-convex optimization problems. It is thus important to develop a non-convex optimization method for robust SVM that can find a good local optimal solution. The second practical issue is how one can tune the hyperparameter that controls the balance between robustness and efficiency. Unfortunately, due to the non-convexity, robust SVM solutions with slightly different hyper-parameter values can be significantly different, which makes model selection highly unstable. In this paper, we address these two issues simultaneously by introducing a novel homotopy approach to non-convex robust SVM learning. Our basic idea is to introduce parametrized formulations of robust SVM which bridge the standard SVM and fully robust SVM via the parameter that represents the influence of outliers. We characterize the necessary and sufficient conditions of the local optimal solutions of robust SVM, and develop an algorithm that can trace a path of local optimal solutions when the influence of outliers is gradually decreased. An advantage of our homotopy approach is that it can be interpreted as simulated annealing, a common approach for finding a good local optimal solution in non-convex optimization problems. In addition, our homotopy method allows stable and efficient model selection based on the path of local optimal solutions. Empirical performances of the proposed approach are demonstrated through intensive numerical experiments both on robust classification and regression problems.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا