No Arabic abstract
In this work we study the quantitative relation between the recursive teaching dimension (RTD) and the VC dimension (VCD) of concept classes of finite sizes. The RTD of a concept class $mathcal C subseteq {0, 1}^n$, introduced by Zilles et al. (2011), is a combinatorial complexity measure characterized by the worst-case number of examples necessary to identify a concept in $mathcal C$ according to the recursive teaching model. For any finite concept class $mathcal C subseteq {0,1}^n$ with $mathrm{VCD}(mathcal C)=d$, Simon & Zilles (2015) posed an open problem $mathrm{RTD}(mathcal C) = O(d)$, i.e., is RTD linearly upper bounded by VCD? Previously, the best known result is an exponential upper bound $mathrm{RTD}(mathcal C) = O(d cdot 2^d)$, due to Chen et al. (2016). In this paper, we show a quadratic upper bound: $mathrm{RTD}(mathcal C) = O(d^2)$, much closer to an answer to the open problem. We also discuss the challenges in fully solving the problem.
Adversarial attacks during the testing phase of neural networks pose a challenge for the deployment of neural networks in security critical settings. These attacks can be performed by adding noise that is imperceptible to humans on top of the original data. By doing so, an attacker can create an adversarial sample, which will cause neural networks to misclassify. In this paper, we seek to understand the theoretical limits of what can be learned by neural networks in the presence of an adversary. We first defined the hypothesis space of a neural network, and showed the relationship between the growth number of the entire neural network and the growth number of each neuron. Combine that with the adversarial Vapnik-Chervonenkis(VC)-dimension of halfspace classifiers, we concluded the adversarial VC-dimension of the neural networks with sign activation functions.
This note sharpens the standard upper bound of the least quadratic nonresidue from $n_pll p^{1/4sqrt{e}+varepsilon}$ to $n_pll p^{1/4e+varepsilon}$, where $varepsilon>0$, unconditionally.
We provide a negative resolution to a conjecture of Steinke and Zakynthinou (2020a), by showing that their bound on the conditional mutual information (CMI) of proper learners of Vapnik--Chervonenkis (VC) classes cannot be improved from $d log n +2$ to $O(d)$, where $n$ is the number of i.i.d. training examples. In fact, we exhibit VC classes for which the CMI of any proper learner cannot be bounded by any real-valued function of the VC dimension only.
Contextual Bandits find important use cases in various real-life scenarios such as online advertising, recommendation systems, healthcare, etc. However, most of the algorithms use flat feature vectors to represent context whereas, in the real world, there is a varying number of objects and relations among them to model in the context. For example, in a music recommendation system, the user context contains what music they listen to, which artists create this music, the artist albums, etc. Adding richer relational context representations also introduces a much larger context space making exploration-exploitation harder. To improve the efficiency of exploration-exploitation knowledge about the context can be infused to guide the exploration-exploitation strategy. Relational context representations allow a natural way for humans to specify knowledge owing to their descriptive nature. We propose an adaptation of Knowledge Infused Policy Gradients to the Contextual Bandit setting and a novel Knowledge Infused Policy Gradients Upper Confidence Bound algorithm and perform an experimental analysis of a simulated music recommendation dataset and various real-life datasets where expert knowledge can drastically reduce the total regret and where it cannot.
Let $G$ be a finite (not necessarily abelian) group and let $p=p(G)$ be the smallest prime number dividing $|G|$. We prove that $d(G)leq frac{|G|}{p}+9p^2-10p$, where $d(G)$ denotes the small Davenport constant of $G$ which is defined as the maximal integer $ell$ such that there is a sequence over $G$ of length $ell$ contains no nonempty one-product subsequence.