Data sites arising from the modeling of high-dimensional problems are often scattered in irregular, unstructured ways. Except for sporadic clustering at some spots, they become relatively far apart as the dimension of the ambient space grows. These features defy any theoretical treatment that requires local or global quasi-uniformity of the distribution of data sites. Incorporating a recently developed application of integral operator theory in machine learning, we propose and study in this article a new framework for analyzing kernel interpolation of high-dimensional data, which features bounding the stochastic approximation error by a hybrid (discrete and continuous) $K$-functional tied to the spectrum of the underlying kernel matrix. Both theoretical analysis and numerical simulations show that the spectra of kernel matrices are reliable and stable barometers for gauging the performance of kernel-interpolation methods on high-dimensional data.
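The spectral diagnostic described above can be illustrated with a minimal sketch: assemble a Gaussian kernel matrix on scattered high-dimensional sites and inspect its eigenvalues. This is not the paper's framework, only an assumed setup (Gaussian kernel, random sites, a condition-number proxy) showing what "the spectrum of the kernel matrix" refers to computationally.

```python
import numpy as np

def gaussian_kernel_matrix(X, eps=1.0):
    # Pairwise squared distances between data sites (rows of X),
    # then the Gaussian kernel K_ij = exp(-eps * |x_i - x_j|^2).
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-eps * sq)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))   # 50 scattered sites in R^20
K = gaussian_kernel_matrix(X, eps=0.5)

eigvals = np.linalg.eigvalsh(K)     # spectrum of the kernel matrix (ascending)
# The ratio of the extreme eigenvalues is a rough conditioning barometer:
# rapid spectral decay signals an ill-conditioned interpolation problem.
print(eigvals[-1] / eigvals[0])
```

In high ambient dimension the pairwise distances concentrate and grow, so the off-diagonal kernel entries shrink and the matrix drifts toward the identity; the eigenvalue spread makes that effect directly visible.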
Numerical integration is encountered in all fields of numerical analysis and the engineering sciences. Many efficient and accurate quadrature rules are known by now, for instance Gauss-type quadrature rules. In many applications, however, it may be impractical, if not impossible, to obtain data at the nodes prescribed by known quadrature rules. Often, experimental measurements are performed at equidistant or even scattered points in space or time. In this work, we propose stable high-order quadrature rules for experimental data, which can accurately handle general weight functions.
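One simple way to obtain quadrature weights on arbitrary nodes, sketched below under the assumption of a unit weight function on $[-1,1]$, is to solve a moment-matching system in the least-squares sense. This is an illustrative construction, not the rule proposed in the paper; the function name and the monomial basis are choices made here for clarity.

```python
import numpy as np

def ls_quadrature_weights(x, degree, a=-1.0, b=1.0):
    """Least-squares quadrature weights on scattered nodes x in [a, b],
    exact for polynomials up to `degree` (weight function w(x) = 1)."""
    # Rows of V: monomials evaluated at the nodes; we ask V @ w = moments.
    V = np.vander(x, degree + 1, increasing=True).T
    moments = np.array([(b**(k + 1) - a**(k + 1)) / (k + 1)
                        for k in range(degree + 1)])
    # Underdetermined consistent system: lstsq returns the minimum-norm
    # weight vector satisfying all moment conditions.
    w, *_ = np.linalg.lstsq(V, moments, rcond=None)
    return w

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-1, 1, 40))        # 40 scattered nodes
w = ls_quadrature_weights(x, degree=8)
print(w @ x**4)                            # should match int_{-1}^{1} x^4 dx = 2/5
```

Choosing the minimum-norm weights tends to keep them small and of balanced sign, which is one standard route toward stability on scattered nodes; a general weight function would enter through its moments.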
With the rapid growth of data, extracting effective information from data is one of the most fundamental problems. In this paper, based on Tikhonov regularization, we propose an effective method for reconstructing a function and its derivative from scattered data with random noise. Since the noise level is not assumed to be small, we exploit the sheer amount of data to reduce the random error, while using a relatively small number of knots for interpolation. We construct an indicator function for our algorithm that flags where the numerical results are reliable and where they may not be. The corresponding error estimates are obtained, and we show how to choose the number of interpolation knots in the reconstruction process so as to balance the random errors against the interpolation errors. Numerical examples demonstrate the effectiveness and speed of our method. We remark that the algorithm can also be applied to online data.
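The core idea of Tikhonov-regularized reconstruction from noisy samples can be sketched in a few lines. The penalty below (a second-difference operator on a uniform grid) and the parameter value are assumptions for illustration only; the paper's actual knot selection, indicator function, and error balancing are not reproduced here.

```python
import numpy as np

def tikhonov_reconstruct(y, lam):
    """Recover a smooth signal u from noisy samples y by minimising
    ||u - y||^2 + lam * ||D u||^2, with D a second-difference matrix."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)            # (n-2) x n curvature penalty
    # Normal equations of the quadratic objective: (I + lam D^T D) u = y.
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 200)
true = np.sin(2 * np.pi * t)
y = true + 0.1 * rng.standard_normal(200)          # noise level not small
u = tikhonov_reconstruct(y, lam=50.0)
print(np.mean((u - true) ** 2), np.mean((y - true) ** 2))
```

The abstract's trade-off is visible here: more samples average down the random error, while the regularization strength `lam` plays the role that the number of interpolation knots plays in the paper's method.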
The quality of datasets is a critical issue in big data mining: more interesting patterns can be mined from datasets of higher quality. Missing values in geographical data degrade the quality of big datasets, so to improve data quality the missing values generally need to be estimated, using machine learning algorithms or mathematical methods such as approximation and interpolation. In this paper, we propose an adaptive Radial Basis Function (RBF) interpolation algorithm for estimating missing values in geographical data. In the proposed method, samples with known values are treated as data points, while samples with missing values are treated as interpolated points. For each interpolated point, a local set of data points is first determined adaptively; the missing value is then imputed by RBF interpolation over this local set. Moreover, the shape factors of the RBF are also determined adaptively, by considering the distribution of the local set of data points. To evaluate the performance of the proposed method, we compare it with the commonly used k Nearest Neighbors (kNN) interpolation and Adaptive Inverse Distance Weighted (AIDW) methods in three groups of benchmark experiments. Experimental results indicate that the proposed method outperforms kNN interpolation and AIDW in terms of accuracy, but is worse in terms of efficiency.
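The two-step procedure described above (select a local set of data points, then interpolate with an adaptively scaled RBF) can be sketched as follows. The Gaussian basis, the fixed neighbourhood size, and the mean-distance shape factor are simplifying assumptions made here; the paper's adaptive rules are more elaborate.

```python
import numpy as np

def rbf_impute(known_xy, known_v, query_xy, k=8):
    """Impute a value at query_xy from its k nearest known samples via
    Gaussian RBF interpolation; the shape factor is set from the mean
    pairwise spacing of the local set (an illustrative heuristic)."""
    # Step 1: adaptively pick the local set of data points (k nearest).
    d = np.linalg.norm(known_xy - query_xy, axis=1)
    idx = np.argsort(d)[:k]
    P, v = known_xy[idx], known_v[idx]
    # Step 2: RBF interpolation on the local set.
    r = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    c = r[r > 0].mean()                  # shape factor from local spacing
    coef = np.linalg.solve(np.exp(-(r / c) ** 2), v)
    phi = np.exp(-(np.linalg.norm(P - query_xy, axis=1) / c) ** 2)
    return phi @ coef

rng = np.random.default_rng(3)
pts = rng.uniform(0, 1, (200, 2))        # samples with known values
vals = np.sin(pts[:, 0]) + pts[:, 1]
est = rbf_impute(pts, vals, np.array([0.5, 0.5]))   # an "interpolated point"
print(est)
```

Tying the shape factor to the local point spacing is what makes the scheme adaptive: dense clusters get narrow basis functions, sparse regions get wide ones.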
The error between appropriately smooth functions and their radial basis function interpolants, as the interpolation points fill out a bounded domain in $\mathbb{R}^d$, is well studied. In all of these cases, the analysis takes place in a natural function space dictated by the choice of radial basis function, the native space, which contains functions possessing a certain amount of smoothness. This paper establishes error estimates when the function being interpolated is conspicuously rough, that is, less smooth than membership in the native space requires.
In this work, we propose and investigate stable high-order collocation-type discretisations of the discontinuous Galerkin method on equidistant and scattered collocation points. We do so by incorporating the concept of discrete least squares into the discontinuous Galerkin framework. Discrete least squares approximations allow us to construct stable, high-order accurate approximations on arbitrary collocation points, while discrete least squares quadrature rules allow their stable and exact numerical integration. Both are computed efficiently using bases of discrete orthogonal polynomials. The proposed discretisation thus generalises known classes of discontinuous Galerkin discretisations, such as the discontinuous Galerkin collocation spectral element method. We are able to prove conservation and linear $L^2$-stability of the proposed discretisations. Finally, numerical tests investigate their accuracy and demonstrate their extension to nonlinear conservation laws, systems, long-time simulations, and a variable-coefficient problem in two space dimensions.
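The building block named above, a discrete least-squares approximation on arbitrary collocation points computed through discrete orthogonal polynomials, can be sketched with a QR factorisation of the Vandermonde matrix (its `Q` columns form a discretely orthonormal polynomial basis). This is a generic sketch, not the paper's discontinuous Galerkin discretisation; the function name and test function are assumptions.

```python
import numpy as np

def dls_approximate(x, f_vals, degree):
    """Discrete least-squares polynomial fit on arbitrary collocation
    points x, computed stably via QR: the columns of Q are polynomials
    orthonormal w.r.t. the discrete inner product on the nodes."""
    V = np.vander(x, degree + 1, increasing=True)
    Q, R = np.linalg.qr(V)                  # discrete orthogonal basis
    coef = np.linalg.solve(R, Q.T @ f_vals) # monomial coefficients of the fit
    return lambda t: np.vander(t, degree + 1, increasing=True) @ coef

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(-1, 1, 60))         # scattered collocation points
p = dls_approximate(x, np.exp(x), degree=6) # degree-6 fit to exp on [-1, 1]
t = np.linspace(-1, 1, 11)
print(np.max(np.abs(p(t) - np.exp(t))))     # pointwise approximation error
```

Working through `Q` rather than the raw Vandermonde system is what keeps the procedure stable on equidistant or scattered nodes, which is precisely the role the discrete orthogonal polynomials play in the proposed discretisation.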