No Arabic abstract
In the machine learning domain, active learning is an iterative data selection algorithm for maximizing information acquisition and improving model performance with limited training samples. It is very useful, especially for the industrial applications where training samples are expensive, time-consuming, or difficult to obtain. Existing methods mainly focus on active learning for classification, and a few methods are designed for regression such as linear regression or Gaussian process. Uncertainties from measurement errors and intrinsic input noise inevitably exist in the experimental data, which further affects the modeling performance. The existing active learning methods do not incorporate these uncertainties for Gaussian process. In this paper, we propose two new active learning algorithms for the Gaussian process with uncertainties, which are variance-based weighted active learning algorithm and D-optimal weighted active learning algorithm. Through numerical study, we show that the proposed approach can incorporate the impact from uncertainties, and realize better prediction performance. This approach has been applied to improving the predictive modeling for automatic shape control of composite fuselage.
This paper proposes an active learning-based Gaussian process (AL-GP) metamodelling method to estimate the cumulative as well as complementary cumulative distribution function (CDF/CCDF) for forward uncertainty quantification (UQ) problems. Within the field of UQ, previous studies focused on developing AL-GP approaches for reliability (rare event probability) analysis of expensive black-box solvers. A naive iteration of these algorithms with respect to different CDF/CCDF threshold values would yield a discretized CDF/CCDF. However, this approach inevitably leads to a trade-off between accuracy and computational efficiency since both depend (in opposite way) on the selected discretization. In this study, a specialized error measure and a learning function are developed such that the resulting AL-GP method is able to efficiently estimate the CDF/CCDF for a specified range of interest without an explicit dependency on discretization. Particularly, the proposed AL-GP method is able to simultaneously provide accurate CDF and CCDF estimation in their median-low probability regions. Three numerical examples are introduced to test and verify the proposed method.
Shape control is critical to ensure the quality of composite fuselage assembly. In current practice, the structures are adjusted to the design shape in terms of the $ell_2$ loss for further assembly without considering the existing dimensional gap between two structures. Such practice has two limitations: (1) the design shape may not be the optimal shape in terms of a pair of incoming fuselages with different incoming dimensions; (2) the maximum gap is the key concern during the fuselage assembly process. This paper proposes an optimal shape control methodology via the $ell_infty$ loss for composite fuselage assembly process by considering the existing dimensional gap between the incoming pair of fuselages. Besides, due to the limitation on the number of available actuators in practice, we face an important problem of finding the best locations for the actuators among many potential locations, which makes the problem a sparse estimation problem. We are the first to solve the optimal shape control in fuselage assembly process using the $ell_infty$ model under the framework of sparse estimation, where we use the $ell_1$ penalty to control the sparsity of the resulting estimator. From statistical point of view, this can be formulated as the $ell_infty$ loss based linear regression, and under some standard assumptions, such as the restricted eigenvalue (RE) conditions, and the light tailed noise, the non-asymptotic estimation error of the $ell_1$ regularized $ell_infty$ linear model is derived to be the order of $O(sigmasqrt{frac{Slog p}{n}})$, which meets the upper-bound in the existing literature. Compared to the current practice, the case study shows that our proposed method significantly reduces the maximum gap between two fuselages after shape adjustments.
We present a model that can automatically learn alignments between high-dimensional data in an unsupervised manner. Our proposed method casts alignment learning in a framework where both alignment and data are modelled simultaneously. Further, we automatically infer groupings of different types of sequences within the same dataset. We derive a probabilistic model built on non-parametric priors that allows for flexible warps while at the same time providing means to specify interpretable constraints. We demonstrate the efficacy of our approach with superior quantitative performance to the state-of-the-art approaches and provide examples to illustrate the versatility of our model in automatic inference of sequence groupings, absent from previous approaches, as well as easy specification of high level priors for different modalities of data.
In this paper, we propose a new estimation procedure for discovering the structure of Gaussian Markov random fields (MRFs) with false discovery rate (FDR) control, making use of the sorted l1-norm (SL1) regularization. A Gaussian MRF is an acyclic graph representing a multivariate Gaussian distribution, where nodes are random variables and edges represent the conditional dependence between the connected nodes. Since it is possible to learn the edge structure of Gaussian MRFs directly from data, Gaussian MRFs provide an excellent way to understand complex data by revealing the dependence structure among many inputs features, such as genes, sensors, users, documents, etc. In learning the graphical structure of Gaussian MRFs, it is desired to discover the actual edges of the underlying but unknown probabilistic graphical model-it becomes more complicated when the number of random variables (features) p increases, compared to the number of data points n. In particular, when p >> n, it is statistically unavoidable for any estimation procedure to include false edges. Therefore, there have been many trials to reduce the false detection of edges, in particular, using different types of regularization on the learning parameters. Our method makes use of the SL1 regularization, introduced recently for model selection in linear regression. We focus on the benefit of SL1 regularization that it can be used to control the FDR of detecting important random variables. Adapting SL1 for probabilistic graphical models, we show that SL1 can be used for the structure learning of Gaussian MRFs using our suggested procedure nsSLOPE (neighborhood selection Sorted L-One Penalized Estimation), controlling the FDR of detecting edges.
We present a straightforward and efficient way to control unstable robotic systems using an estimated dynamics model. Specifically, we show how to exploit the differentiability of Gaussian Processes to create a state-dependent linearized approximation of the true continuous dynamics that can be integrated with model predictive control. Our approach is compatible with most Gaussian process approaches for system identification, and can learn an accurate model using modest amounts of training data. We validate our approach by learning the dynamics of an unstable system such as a segway with a 7-D state space and 2-D input space (using only one minute of data), and we show that the resulting controller is robust to unmodelled dynamics and disturbances, while state-of-the-art control methods based on nominal models can fail under small perturbations. Code is open sourced at https://github.com/learning-and-control/core .