No Arabic abstract
A reciprocal recommendation problem is one where the goal of learning is not just to predict a users preference towards a passive item (e.g., a book), but to recommend the targeted user on one side another user from the other side such that a mutual interest between the two exists. The problem thus is sharply different from the more traditional items-to-users recommendation, since a good match requires meeting the preferences of both users. We initiate a rigorous theoretical investigation of the reciprocal recommendation task in a specific framework of sequential learning. We point out general limitations, formulate reasonable assumptions enabling effective learning and, under these assumptions, we design and analyze a computationally efficient algorithm that uncovers mutual likes at a pace comparable to those achieved by a clearvoyant algorithm knowing all user preferences in advance. Finally, we validate our algorithm against synthetic and real-world datasets, showing improved empirical performance over simple baselines.
Federated learning (FL) has emerged as a prominent distributed learning paradigm. FL entails some pressing needs for developing novel parameter estimation approaches with theoretical guarantees of convergence, which are also communication efficient, differentially private and Byzantine resilient in the heterogeneous data distribution settings. Quantization-based SGD solvers have been widely adopted in FL and the recently proposed SIGNSGD with majority vote shows a promising direction. However, no existing methods enjoy all the aforementioned properties. In this paper, we propose an intuitively-simple yet theoretically-sound method based on SIGNSGD to bridge the gap. We present Stochastic-Sign SGD which utilizes novel stochastic-sign based gradient compressors enabling the aforementioned properties in a unified framework. We also present an error-feedback variant of the proposed Stochastic-Sign SGD which further improves the learning performance in FL. We test the proposed method with extensive experiments using deep neural networks on the MNIST dataset and the CIFAR-10 dataset. The experimental results corroborate the effectiveness of the proposed method.
We investigate the problem of active learning in the streaming setting in non-parametric regimes, where the labels are stochastically generated from a class of functions on which we make no assumptions whatsoever. We rely on recently proposed Neural Tangent Kernel (NTK) approximation tools to construct a suitable neural embedding that determines the feature space the algorithm operates on and the learned model computed atop. Since the shape of the label requesting threshold is tightly related to the complexity of the function to be learned, which is a-priori unknown, we also derive a version of the algorithm which is agnostic to any prior knowledge. This algorithm relies on a regret balancing scheme to solve the resulting online model selection problem, and is computationally efficient. We prove joint guarantees on the cumulative regret and number of requested labels which depend on the complexity of the labeling function at hand. In the linear case, these guarantees recover known minimax results of the generalization error as a function of the label complexity in a standard statistical learning setting.
Machine learning has shown much promise in helping improve the quality of medical, legal, and economic decision-making. In these applications, machine learning models must satisfy two important criteria: (i) they must be causal, since the goal is typically to predict individual treatment effects, and (ii) they must be interpretable, so that human decision makers can validate and trust the model predictions. There has recently been much progress along each direction independently, yet the state-of-the-art approaches are fundamentally incompatible. We propose a framework for learning causal interpretable models---from observational data---that can be used to predict individual treatment effects. Our framework can be used with any algorithm for learning interpretable models. Furthermore, we prove an error bound on the treatment effects predicted by our model. Finally, in an experiment on real-world data, we show that the models trained using our framework significantly outperform a number of baselines.
We propose a new framework to solve online optimization and learning problems in unknown and uncertain dynamical environments. This framework enables us to simultaneously learn the uncertain dynamical environment while making online decisions in a quantifiably robust manner. The main technical approach relies on the theory of distributional robust optimization that leverages adaptive probabilistic ambiguity sets. However, as defined, the ambiguity set usually leads to online intractable problems, and the first part of our work is directed to find reformulations in the form of online convex problems for two sub-classes of objective functions. To solve the resulting problems in the proposed framework, we further introduce an online version of the Nesterov accelerated-gradient algorithm. We determine how the proposed solution system achieves a probabilistic regret bound under certain conditions. Two applications illustrate the applicability of the proposed framework.
With the recent success of representation learning methods, which includes deep learning as a special case, there has been considerable interest in developing representation learning techniques that can incorporate known physical constraints into the learned representation. As one example, in many applications that involve a signal propagating through physical media (e.g., optics, acoustics, fluid dynamics, etc), it is known that the dynamics of the signal must satisfy constraints imposed by the wave equation. Here we propose a matrix factorization technique that decomposes such signals into a sum of components, where each component is regularized to ensure that it satisfies wave equation constraints. Although our proposed formulation is non-convex, we prove that our model can be efficiently solved to global optimality in polynomial time. We demonstrate the benefits of our work by applications in structural health monitoring, where prior work has attempted to solve this problem using sparse dictionary learning approaches that do not come with any theoretical guarantees regarding convergence to global optimality and employ heuristics to capture desired physical constraints.