No Arabic abstract
The multilabel learning problem with large number of labels, features, and data-points has generated a tremendous interest recently. A recurring theme of these problems is that only a few labels are active in any given datapoint as compared to the total number of labels. However, only a small number of existing work take direct advantage of this inherent extreme sparsity in the label space. By the virtue of Restricted Isometry Property (RIP), satisfied by many random ensembles, we propose a novel procedure for multilabel learning known as RIPML. During the training phase, in RIPML, labels are projected onto a random low-dimensional subspace followed by solving a least-square problem in this subspace. Inference is done by a k-nearest neighbor (kNN) based approach. We demonstrate the effectiveness of RIPML by conducting extensive simulations and comparing results with the state-of-the-art linear dimensionality reduction based approaches.
Matrices satisfying the Restricted Isometry Property (RIP) play an important role in the areas of compressed sensing and statistical learning. RIP matrices with optimal parameters are mainly obtained via probabilistic arguments, as explicit constructions seem hard. It is therefore interesting to ask whether a fixed matrix can be incorporated into a construction of restricted isometries. In this paper, we construct a new broad ensemble of random matrices with dependent entries that satisfy the restricted isometry property. Our construction starts with a fixed (deterministic) matrix $X$ satisfying some simple stable rank condition, and we show that the matrix $XR$, where $R$ is a random matrix drawn from various popular probabilistic models (including, subgaussian, sparse, low-randomness, satisfying convex concentration property), satisfies the RIP with high probability. These theorems have various applications in signal recovery, random matrix theory, dimensionality reduction, etc. Additionally, motivated by an application for understanding the effectiveness of word vector embeddings popular in natural language processing and machine learning applications, we investigate the RIP of the matrix $XR^{(l)}$ where $R^{(l)}$ is formed by taking all possible (disregarding order) $l$-way entrywise products of the columns of a random matrix $R$.
This paper is concerned with the performance of Orthogonal Matching Pursuit (OMP) algorithms applied to a dictionary $mathcal{D}$ in a Hilbert space $mathcal{H}$. Given an element $fin mathcal{H}$, OMP generates a sequence of approximations $f_n$, $n=1,2,dots$, each of which is a linear combination of $n$ dictionary elements chosen by a greedy criterion. It is studied whether the approximations $f_n$ are in some sense comparable to {em best $n$ term approximation} from the dictionary. One important result related to this question is a theorem of Zhang cite{TZ} in the context of sparse recovery of finite dimensional signals. This theorem shows that OMP exactly recovers $n$-sparse signal, whenever the dictionary $mathcal{D}$ satisfies a Restricted Isometry Property (RIP) of order $An$ for some constant $A$, and that the procedure is also stable in $ell^2$ under measurement noise. The main contribution of the present paper is to give a structurally simpler proof of Zhangs theorem, formulated in the general context of $n$ term approximation from a dictionary in arbitrary Hilbert spaces $mathcal{H}$. Namely, it is shown that OMP generates near best $n$ term approximations under a similar RIP condition.
Email has remained a principal form of communication among people, both in enterprise and social settings. With a deluge of emails crowding our mailboxes daily, there is a dire need of smart email systems that can recover important emails and make personalized recommendations. In this work, we study the problem of predicting user triage actions to incoming emails where we take the reply prediction as a working example. Different from existing methods, we formulate the triage action prediction as a recommendation problem and focus on the content-based approach, where the users are represented using the content of current and past emails. We also introduce additional similarity features to further explore the affinities between users and emails. Experiments on the publicly available Avocado email collection demonstrate the advantages of our proposed recommendation framework and our method is able to achieve better performance compared to the state-of-the-art deep recommendation methods. More importantly, we provide valuable insight into the effectiveness of different textual and user representations and show that traditional bag-of-words approaches, with the help from the similarity features, compete favorably with the more advanced neural embedding methods.
This paper proposes CF-NADE, a neural autoregressive architecture for collaborative filtering (CF) tasks, which is inspired by the Restricted Boltzmann Machine (RBM) based CF model and the Neural Autoregressive Distribution Estimator (NADE). We first describe the basic CF-NADE model for CF tasks. Then we propose to improve the model by sharing parameters between different ratings. A factored version of CF-NADE is also proposed for better scalability. Furthermore, we take the ordinal nature of the preferences into consideration and propose an ordinal cost to optimize CF-NADE, which shows superior performance. Finally, CF-NADE can be extended to a deep model, with only moderately increased computational complexity. Experimental results show that CF-NADE with a single hidden layer beats all previous state-of-the-art methods on MovieLens 1M, MovieLens 10M, and Netflix datasets, and adding more hidden layers can further improve the performance.
Recommender systems are being employed across an increasingly diverse set of domains that can potentially make a significant social and individual impact. For this reason, considering fairness is a critical step in the design and evaluation of such systems. In this paper, we introduce HyperFair, a general framework for enforcing soft fairness constraints in a hybrid recommender system. HyperFair models integrate variations of fairness metrics as a regularization of a joint inference objective function. We implement our approach using probabilistic soft logic and show that it is particularly well-suited for this task as it is expressive and structural constraints can be added to the system in a concise and interpretable manner. We propose two ways to employ the methods we introduce: first as an extension of a probabilistic soft logic recommender system template; second as a fair retrofitting technique that can be used to improve the fairness of predictions from a black-box model. We empirically validate our approach by implementing multiple HyperFair hybrid recommenders and compare them to a state-of-the-art fair recommender. We also run experiments showing the effectiveness of our methods for the task of retrofitting a black-box model and the trade-off between the amount of fairness enforced and the prediction performance.