Recent work shows that group equivariance, as an inductive bias, improves neural network performance for both classification and generation. However, designing group-equivariant neural networks is challenging when the group of interest is large and unknown. Moreover, inducing equivariance can significantly reduce the number of independent parameters in a network of fixed feature size, affecting its overall performance. We address these problems by proving a new group-theoretic result in the context of equivariant neural networks: a network is equivariant to a large group if and only if it is equivariant to the smaller groups from which the large group is constructed. Using this result, we design a fast group-equivariant construction algorithm and a deep Q-learning-based search algorithm over a reduced search space, yielding what we call autoequivariant networks (AENs). AENs find the right balance between equivariance and network size when tested on G-MNIST and G-Fashion-MNIST, new benchmark datasets we release, obtained by applying group transformations to MNIST and Fashion-MNIST respectively. Extending these results to group convolutional neural networks (GCNNs), where we optimize over equivariances, augmentations, and network sizes, we find group equivariance to be the dominant factor in all high-performing GCNNs on several datasets, including CIFAR10, SVHN, RotMNIST, ASL, EMNIST, and KMNIST.
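The core result above reduces an equivariance check over a large group to checks over its generating subgroups. The following is a minimal numpy sketch of this idea for a single linear layer acting on signals over 8 points: symmetrizing the layer over two generators (a cyclic shift and a reversal, which together generate a dihedral group) makes it equivariant to every product of the generators as well. The symmetrization-by-averaging and the choice of group are illustrative assumptions, not the paper's construction algorithm.

```python
import numpy as np

# Equivariance of a linear layer W to a permutation group G means
# P_g @ W == W @ P_g for every g in G. Per the if-and-only-if result above,
# it suffices to enforce/check this on the groups generating G.

def perm_matrix(p):
    """Permutation matrix mapping basis vector i to basis vector p[i]."""
    m = np.zeros((len(p), len(p)))
    m[p, np.arange(len(p))] = 1.0
    return m

n = 8
g1 = perm_matrix(np.roll(np.arange(n), 1))  # cyclic shift (generator 1)
g2 = perm_matrix(np.arange(n)[::-1])        # reversal     (generator 2)

# Enumerate the group generated by g1 and g2 (a dihedral group of order 16).
group, seen, frontier = [], set(), [np.eye(n)]
while frontier:
    g = frontier.pop()
    key = g.astype(int).tobytes()
    if key not in seen:
        seen.add(key)
        group.append(g)
        frontier += [g @ g1, g @ g2]

# Symmetrize a random layer by averaging over the group (Reynolds operator).
rng = np.random.default_rng(0)
W = rng.standard_normal((n, n))
W_eq = sum(g.T @ W @ g for g in group) / len(group)

# Equivariance holds for the generators and for their products, as claimed.
for g in (g1, g2, g1 @ g2, g2 @ g1 @ g1):
    assert np.allclose(g @ W_eq, W_eq @ g)
```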
A stochastic multi-user multi-armed bandit framework is used to develop algorithms for uncoordinated spectrum access. In contrast to prior work, rewards are assumed to be possibly non-zero even under collisions, thus allowing the number of users to be greater than the number of channels. The proposed algorithm consists of an estimation phase and an allocation phase. It is shown that if every user adopts the algorithm, the system-wide regret is order-optimal, of order $O(\log T)$ over a time horizon of duration $T$. The regret guarantees hold whether the number of users is greater or less than the number of channels. The algorithm is extended to the dynamic case, where the number of users in the system evolves over time, and is shown to achieve sub-linear regret.
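As an illustration of the two-phase structure, here is a small numpy simulation with more users than channels and collision-dampened (but non-zero) rewards. The phase lengths, collision model, and greedy commitment rule are placeholder assumptions, not the paper's exact specification.

```python
import numpy as np

# Estimation phase: users hop uniformly at random and build channel-mean
# estimates. Allocation phase: each user commits to its estimated-best
# channel. Collisions dampen rather than zero out the reward, so the number
# of users N may exceed the number of channels K.

rng = np.random.default_rng(0)
N, K, T, T_est = 4, 3, 20_000, 2_000     # users, channels, horizon, phase split
mu = rng.uniform(0.2, 0.9, size=(N, K))  # per-user mean reward per channel
collision_factor = 0.5                   # reward scaling under collision

counts = np.zeros((N, K))
sums = np.zeros((N, K))
choices = rng.integers(K, size=N)

for t in range(T):
    if t < T_est:                        # --- estimation: random hopping ---
        choices = rng.integers(K, size=N)
    elif t == T_est:                     # --- allocation: commit ---
        est = np.divide(sums, counts, out=np.zeros_like(sums),
                        where=counts > 0)
        choices = est.argmax(axis=1)
    occupancy = np.bincount(choices, minlength=K)
    for u in range(N):
        a = choices[u]
        scale = 1.0 if occupancy[a] == 1 else collision_factor
        r = scale * float(rng.random() < mu[u, a])  # Bernoulli base reward
        counts[u, a] += 1
        sums[u, a] += r
# Note: estimates here absorb collision dampening; a faithful protocol would
# correct for (or avoid) collision-contaminated samples.
```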
Multi-user multi-armed bandits have emerged as a good model for uncoordinated spectrum access problems. In this paper we consider the scenario where users cannot communicate with each other. In addition, the environment may appear differently to different users, i.e., the mean rewards observed by different users for the same channel may differ. In this setup, we present a policy that achieves a regret of $O(\log T)$. This paper has been accepted at the 2019 Asilomar Conference on Signals, Systems, and Computers.
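To make the heterogeneity concrete, the sketch below runs a plain UCB1 policy (a standard single-agent policy with $O(\log T)$ regret; the paper's policy additionally resolves collisions among users without communication) from the viewpoint of two users who observe different mean rewards on the same channels.

```python
import numpy as np

def ucb1(mu_u, T, rng):
    """Single user's UCB1 over K channels with user-specific means mu_u."""
    K = len(mu_u)
    counts, sums = np.zeros(K), np.zeros(K)
    for t in range(T):
        if t < K:
            a = t  # play each channel once to initialize
        else:
            a = int(np.argmax(sums / counts + np.sqrt(2 * np.log(t) / counts)))
        r = float(rng.random() < mu_u[a])  # Bernoulli reward
        counts[a] += 1
        sums[a] += r
    return sums.sum()

rng = np.random.default_rng(1)
# Same three channels, different mean rewards per user; note that the users'
# best channels differ, which is exactly what the heterogeneous setting allows.
print(ucb1(np.array([0.9, 0.5, 0.3]), 10_000, rng))
print(ucb1(np.array([0.2, 0.8, 0.6]), 10_000, rng))
```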
We consider a fully decentralized multi-player stochastic multi-armed bandit setting where the players cannot communicate with each other and can observe only their own actions and rewards. The environment may appear differently to different players, i.e., the reward distributions for a given arm are heterogeneous across players. In the case of a collision (when more than one player plays the same arm), we allow the colliding players to receive non-zero rewards. The time horizon $T$ for which the arms are played is not known to the players. In this setup, where the number of players is allowed to be greater than the number of arms, we present a policy that achieves near order-optimal expected regret of order $O(\log^{1+\delta} T)$ for some $0 < \delta < 1$ over a time horizon of duration $T$. This paper is currently under review at IEEE Transactions on Information Theory.
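Because the horizon $T$ is unknown to the players, policies of this kind typically proceed in epochs of growing length, restarting exploration at the start of each epoch. The schedule below is an illustrative assumption (doubling epochs with a poly-logarithmic exploration budget), not the paper's exact construction.

```python
import math

def epoch_schedule(delta=0.5, max_epochs=20):
    """Yield (exploration_len, exploitation_len) for epochs of length 2^j.

    The exploration budget grows poly-logarithmically in the epoch length,
    mirroring the log^{1+delta} scaling in the abstract; the constants and
    epoch lengths here are placeholders, not the paper's specification.
    """
    for j in range(1, max_epochs + 1):
        epoch_len = 2 ** j
        explore = math.ceil(math.log(epoch_len) ** (1 + delta))
        yield explore, max(epoch_len - explore, 0)

for explore, exploit in list(epoch_schedule())[:6]:
    print(explore, exploit)
```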