WTA (Winner-Take-All) hashing has been successfully applied in many large-scale vision applications. The hashing scheme is tailored to exploit comparative reasoning (order-based information), which has yielded significant accuracy improvements. In this paper, we identify a subtle issue with WTA that grows with the sparsity of the dataset and limits the discriminative power of WTA. We then propose a solution based on the idea of densification, which provably fixes the issue. Our experiments show that Densified WTA Hashing consistently and significantly outperforms vanilla WTA in both image classification and retrieval tasks.
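To make the contrast concrete, below is a minimal Python sketch of vanilla WTA hashing next to a densified variant. The function names, the window size K, and the circular borrowing rule are our illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def wta_hash(x, perms, K):
    """Vanilla WTA: for each random permutation, inspect the first K
    permuted coordinates and record the index of the largest one.
    On sparse data a window is often all zeros, so argmax defaults to
    index 0 -- the kind of tie that limits discriminative power."""
    return [int(np.argmax(x[p[:K]])) for p in perms]

def densified_wta_hash(x, perms, K):
    """Densified WTA (sketch): an all-zero window borrows the winner of
    the next window that produced a valid (nonzero) winner."""
    codes = [int(np.argmax(x[p[:K]])) if np.any(x[p[:K]] != 0) else None
             for p in perms]
    m = len(perms)
    for i in range(m):
        if codes[i] is None:
            j = (i + 1) % m
            while j != i and codes[j] is None:
                j = (j + 1) % m
            codes[i] = codes[j] if codes[j] is not None else 0
    return codes

rng = np.random.default_rng(0)
d, m, K = 64, 16, 4                                  # illustrative sizes
perms = [rng.permutation(d) for _ in range(m)]
x = np.zeros(d)
x[rng.choice(d, 5, replace=False)] = rng.random(5)   # a sparse input
print(wta_hash(x, perms, K))
print(densified_wta_hash(x, perms, K))
```

With 5 nonzeros out of 64 dimensions, most size-4 windows are all zero, so vanilla WTA emits mostly 0s while the densified codes stay informative.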
Inspired by advances in biological science, the study of sparse binary projection models has attracted considerable recent research attention. These models project dense input samples into a higher-dimensional space and output sparse binary data representations after a Winner-Take-All competition, subject to the constraint that the projection matrix is also sparse and binary. Following this line of work, we develop a supervised WTA model for the setting where training samples with both input and output representations are available, from which the optimal projection matrix can be obtained with a simple, effective, and efficient algorithm. We further extend the model and the algorithm to an unsupervised setting where only the input representation of the samples is available. In a series of empirical evaluations on similarity search tasks, the proposed models report significantly improved results over state-of-the-art methods in both search accuracy and running speed. These results give us strong confidence that the work provides a highly practical tool for real-world applications.
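As a rough illustration of the forward model both settings share (a sparse binary projection followed by a Winner-Take-All step), here is a short sketch; learning the optimal projection matrix, which is the paper's contribution, is not shown, and all names and sizes below are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, fan_in, k = 128, 2048, 12, 32       # illustrative sizes

# Sparse binary projection matrix: each output unit reads fan_in inputs.
M = np.zeros((d_out, d_in), dtype=np.uint8)
for row in M:
    row[rng.choice(d_in, size=fan_in, replace=False)] = 1

def wta_encode(x, M, k):
    """Dense input -> higher-dimensional sparse binary code: project with
    the sparse binary matrix, then let only the k largest units fire."""
    activations = M @ x
    code = np.zeros(M.shape[0], dtype=np.uint8)
    code[np.argsort(activations)[-k:]] = 1       # the k winners
    return code

x = rng.random(d_in)
y = wta_encode(x, M, k)
assert y.sum() == k                              # exactly k active bits
```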
In this work we study biological neural networks from an algorithmic perspective, focusing on understanding tradeoffs between computation time and network complexity. Our goal is to abstract real neural networks in a way that, while not capturing all interesting features, preserves high-level behavior and allows us to make biologically relevant conclusions. Towards this goal, we consider the implementation of algorithmic primitives in a simple yet biologically plausible model of stochastic spiking neural networks. In particular, we show how the stochastic behavior of neurons in this model can be leveraged to solve a basic symmetry-breaking task in which we are given neurons with identical firing rates and want to select a distinguished one. In computational neuroscience, this is known as the winner-take-all (WTA) problem, and it is believed to serve as a basic building block in many tasks, e.g., learning, pattern recognition, and clustering. We provide efficient constructions of WTA circuits in our stochastic spiking neural network model, as well as lower bounds in terms of the number of auxiliary neurons required to drive convergence to WTA in a given number of steps. These lower bounds demonstrate that our constructions are near-optimal in some cases. This work covers and gives more in-depth proofs of a subset of results originally published in [LMP17a]. It is adapted from the last chapter of C. Musco's Ph.D. thesis [Mus18].
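The following toy simulation is a caricature of the symmetry-breaking idea, not the paper's circuit construction (it has no explicit auxiliary inhibitory neurons); it only shows how intrinsic randomness can single out one neuron among identically behaving ones.

```python
import numpy as np

def stochastic_wta(n, p=0.5, rng=None):
    """n neurons with identical firing probability p fire independently
    each step; whenever the number of spikes differs from one, (implicit)
    inhibition resets the race. The first step with exactly one spike
    determines the winner."""
    rng = rng or np.random.default_rng()
    steps = 0
    while True:
        steps += 1
        fired = np.flatnonzero(rng.random(n) < p)
        if fired.size == 1:          # symmetry broken: a unique winner
            return int(fired[0]), steps

winner, steps = stochastic_wta(n=8, rng=np.random.default_rng(1))
print(f"neuron {winner} won after {steps} steps")
```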
Winner-Take-All (WTA) refers to the neural operation that selects a (typically small) group of neurons from a large neuron pool. It is conjectured to underlie many of the brain's fundamental computational abilities. However, not much is known about the robustness of a spike-based WTA network to the inherent randomness of the input spike trains. In this work, we consider a spike-based $k$-WTA model wherein $n$ randomly generated input spike trains compete with each other based on their underlying statistics, and $k$ winners are supposed to be selected. We slot time evenly, with each time slot of length $1\,\mathrm{ms}$, and model the $n$ input spike trains as $n$ independent Bernoulli processes. The Bernoulli process is a good approximation of the popular Poisson process but is more biologically relevant, as it takes refractory periods into account. Due to the randomness in the input spike trains, no circuit can guarantee to select the correct winners in finite time. We focus on analytically characterizing the minimal amount of time needed so that a target minimax decision accuracy (success probability) can be reached. We first derive an information-theoretic lower bound on the decision time. We show that to have a (minimax) decision error $\le \delta$ (where $\delta \in (0,1)$), the computation time of any WTA circuit is at least \[ \big((1-\delta)\log(k(n-k)+1)-1\big)\,T_{\mathcal{R}}, \] where $T_{\mathcal{R}}$ is a difficulty parameter of a WTA task that is independent of $\delta$, $n$, and $k$. We then design a simple WTA circuit whose decision time is \[ O\big(\big(\log\tfrac{1}{\delta}+\log k(n-k)\big)\,T_{\mathcal{R}}\big). \] It turns out that for any fixed $\delta \in (0,1)$, this decision time is order-optimal in terms of its scaling in $n$, $k$, and $T_{\mathcal{R}}$.
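For intuition, here is a small sketch that simulates the Bernoulli input model and evaluates the abstract's lower bound. The count-and-select rule is our stand-in for an actual WTA circuit, the natural logarithm is assumed, and $T_{\mathcal{R}}$ is taken as a given parameter.

```python
import numpy as np

def k_wta_by_counting(rates, k, T, rng=None):
    """Observe n independent Bernoulli spike trains (one 1 ms slot at a
    time) for T slots, then declare the k highest-count trains winners."""
    rng = rng or np.random.default_rng()
    spikes = rng.random((T, len(rates))) < np.asarray(rates)  # T x n slots
    counts = spikes.sum(axis=0)
    return np.sort(np.argsort(counts)[-k:])

def decision_time_lower_bound(n, k, delta, T_R):
    """The abstract's information-theoretic lower bound on decision time:
    ((1 - delta) * log(k*(n-k) + 1) - 1) * T_R  (natural log assumed)."""
    return ((1 - delta) * np.log(k * (n - k) + 1) - 1) * T_R

rates = [0.9, 0.8, 0.3, 0.2, 0.1]      # illustrative firing probabilities
print(k_wta_by_counting(rates, k=2, T=200))        # likely [0 1]
print(decision_time_lower_bound(n=5, k=2, delta=0.1, T_R=1.0))
```

Longer observation windows T drive the error probability down, which is the trade-off the bounds quantify.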
Influence competition finds its significance in many applications, such as marketing, politics, and public events like COVID-19. Existing work tends to believe that the stronger influence will always win and dominate nearly the whole network, i.e., the winner takes all. However, this finding somewhat contradicts our common sense that many competing products actually coexist, e.g., Android vs. iOS. This contradiction naturally raises the question: will the winner take all? To answer this question, we make a comprehensive study of influence competition by identifying two factors frequently overlooked by prior art: (1) the incomplete observation of real diffusion networks; (2) the existence of information overload and its impact on user behaviors. To this end, we attempt to recover possible diffusion links based on user similarities, which are extracted by embedding users into a latent space. Following this, we further derive the condition under which users become overloaded, and formulate the competing processes in which user behaviors differ before and after information overload. By establishing explicit expressions of the competing dynamics, we show that information overload acts as the critical boundary line: before it, the winner-takes-all phenomenon will definitively occur, whereas after information overload the share of influence gradually stabilizes and is jointly affected by the initial spreading conditions, influence powers, and the advent of overload. Numerous experiments are conducted to validate our theoretical results, and favorable agreement is found. Our work sheds light on the intrinsic driving forces behind real-world dynamics, thus providing useful insights into effective information engineering.
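The paper's explicit competing dynamics are not reproduced here, but the heavily simplified toy simulation below illustrates the qualitative claim: capping how many messages a user can process (information overload) can let the weaker influence retain a share instead of the winner taking all. Every modeling choice in the sketch is our own assumption.

```python
import numpy as np

def compete(adj, seeds_a, seeds_b, p_a, p_b, cap, steps, rng=None):
    """Toy competing diffusion with an overload cap: each step an
    undecided user reads at most `cap` messages sampled from already
    convinced neighbors and adopts the side that first persuades it."""
    rng = rng or np.random.default_rng()
    n = len(adj)
    state = np.zeros(n, dtype=int)           # 0 undecided, 1 side A, 2 side B
    state[list(seeds_a)] = 1
    state[list(seeds_b)] = 2
    for _ in range(steps):
        nxt = state.copy()
        for u in np.flatnonzero(state == 0):
            senders = [v for v in adj[u] if state[v] != 0]
            rng.shuffle(senders)
            for v in senders[:cap]:          # overload: ignore the rest
                if rng.random() < (p_a if state[v] == 1 else p_b):
                    nxt[u] = state[v]
                    break
        state = nxt
    return (state == 1).mean(), (state == 2).mean()

rng = np.random.default_rng(0)
n = 300
adj = [list(rng.choice(n, size=8, replace=False)) for _ in range(n)]
share_a, share_b = compete(adj, {0}, {1}, p_a=0.6, p_b=0.4,
                           cap=3, steps=30, rng=rng)
print(share_a, share_b)   # the weaker side typically keeps a nonzero share
```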
Online image hashing has received increasing research attention recently; it processes large-scale data in a streaming fashion to update the hash functions on the fly. Most existing works exploit this problem under a supervised setting, i.e., using class labels to boost the hashing performance, which suffers from defects in both adaptivity and efficiency: First, large amounts of training batches are required to learn up-to-date hash functions, which leads to poor online adaptivity. Second, the training is time-consuming, which contradicts the core need of online learning. In this paper, a novel supervised online hashing scheme, termed Fast Class-wise Updating for Online Hashing (FCOH), is proposed to address these two challenges by introducing a novel and efficient inner product operation. To achieve fast online adaptivity, a class-wise updating method is developed to decompose the binary code learning and alternately renew the hash functions in a class-wise fashion, which alleviates the burden of large amounts of training batches. Quantitatively, this decomposition further leads to at least 75% storage savings. To further achieve online efficiency, we propose a semi-relaxation optimization, which accelerates the online training by treating different binary constraints independently. Without additional constraints and variables, the time complexity is significantly reduced. The scheme is also quantitatively shown to preserve past information well while updating the hash functions. We demonstrate that the collective effort of class-wise updating and semi-relaxation optimization yields superior performance compared to various state-of-the-art methods, verified through extensive experiments on three widely used datasets.
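FCOH's actual inner-product formulation and semi-relaxation optimization are not reproduced here; the sketch below only illustrates the general flavor of class-wise, streaming updates with accumulated statistics (so past information is retained across batches). Every design detail, name, and parameter in it is our assumption.

```python
import numpy as np

class ClasswiseOnlineHasher:
    """Toy sketch of class-wise online hashing: each class owns a fixed
    target binary code, and the linear hash functions W are refreshed
    online (recursive ridge regression) so that sign(x @ W) approaches
    the code of x's class."""
    def __init__(self, dim, bits, n_classes, lam=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.B = np.sign(rng.standard_normal((n_classes, bits)))  # class codes
        self.A = lam * np.eye(dim)        # running regularized X^T X
        self.C = np.zeros((dim, bits))    # running X^T (target codes)
        self.W = np.zeros((dim, bits))

    def update(self, X, y):
        """Consume one streaming batch: X (batch, dim), y (batch,) labels.
        Accumulating A and C preserves past information cheaply."""
        T = self.B[y]                     # per-sample target codes
        self.A += X.T @ X
        self.C += X.T @ T
        self.W = np.linalg.solve(self.A, self.C)

    def hash(self, X):
        return (X @ self.W > 0).astype(np.uint8)

h = ClasswiseOnlineHasher(dim=32, bits=16, n_classes=10)
rng = np.random.default_rng(1)
X, y = rng.standard_normal((64, 32)), rng.integers(0, 10, 64)
h.update(X, y)                            # one streaming batch
codes = h.hash(X)                         # 16-bit codes for retrieval
```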