Virtually all of the deep learning literature relies on the assumption that large amounts of training data are available. Indeed, even the majority of few-shot learning methods rely on a large set of base classes for pretraining. This assumption, however, does not always hold. For some tasks, annotating a large number of classes can be infeasible, and even collecting the images themselves can be a challenge in some scenarios. In this paper, we study this problem and call it the Small Data setting, in contrast to Big Data. To unlock the full potential of small data, we propose to augment the models with annotations for other related tasks, thus increasing their generalization abilities. In particular, we use the richly annotated scene parsing dataset ADE20K to construct our realistic Long-tail Recognition with Diverse Supervision (LRDS) benchmark by splitting the object categories into head and tail based on their distribution. Following the standard few-shot learning protocol, we use the head classes for representation learning and the tail classes for evaluation. Moreover, we further subsample the head categories and images to generate two novel settings, which we call Scarce-Class and Scarce-Image, corresponding respectively to the shortage of samples for rare classes and of training images. Finally, we analyze the effect of applying various additional supervision sources under the proposed settings. Our experiments demonstrate that densely labeling a small set of images can indeed largely remedy the small data constraints.
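To make the head/tail split described in the abstract above concrete, the following is a minimal sketch of partitioning categories by how often they occur. The function name `split_head_tail`, the `min_head_count` threshold, and the toy label list are all assumptions made for illustration; the abstract does not state how the LRDS benchmark actually draws the boundary.

```python
from collections import Counter


def split_head_tail(labels, min_head_count):
    """Partition categories into head (frequent) and tail (rare).

    `min_head_count` is a hypothetical threshold chosen for this sketch;
    the actual benchmark may use a different criterion.
    """
    counts = Counter(labels)
    head = sorted(c for c, n in counts.items() if n >= min_head_count)
    tail = sorted(c for c, n in counts.items() if n < min_head_count)
    return head, tail


# Toy instance labels standing in for object annotations (illustrative only).
labels = ["wall"] * 50 + ["chair"] * 20 + ["lamp"] * 3 + ["vase"] * 2
head, tail = split_head_tail(labels, min_head_count=10)
print("head (representation learning):", head)
print("tail (few-shot evaluation):", tail)
```

Under this sketch, the head categories would be used for representation learning and the tail categories held out for evaluation, mirroring the few-shot protocol the abstract mentions.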
Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. A common approach is to rely on Product Quantization, which allows the storage of large vector databases in memory and efficient distance computations. Yet, implementations of nearest neighbor search with Product Quantization have their performance limited by the many memory accesses they perform. Following this observation, Andre et al. proposed Quick ADC with up to $6\times$ faster implementations of $m\times{}4$ product quantizers (PQ) leveraging specific SIMD instructions. Quicker ADC is a generalization of Quick ADC not limited to $m\times{}4$ codes and supporting AVX-512, the latest revision of the SIMD instruction set. In doing so, Quicker ADC faces the challenge of efficiently using 5-, 6-, and 7-bit shuffles that do not align to computer bytes or words. To this end, we introduce (i) irregular product quantizers combining sub-quantizers of different granularity and (ii) split tables allowing lookup tables larger than registers. We evaluate Quicker ADC with multiple indexes including Inverted Multi-Indexes and IVF HNSW and show that it outperforms the reference optimized implementations (i.e., FAISS and polysemous codes) for numerous configurations. Finally, we release an open-source fork of FAISS enhanced with Quicker ADC at http://github.com/nlescoua/faiss-quickeradc.
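For readers unfamiliar with asymmetric distance computation (ADC), the sketch below shows the scalar, table-lookup form of the distance computation that product quantization relies on. It is plain Python, not the SIMD implementation the abstract describes, and the function names and toy codebooks are assumptions made for illustration. Each sub-quantizer may have a different number of centroids, which loosely mirrors the idea of combining sub-quantizers of different granularity.

```python
def build_lookup_tables(query, codebooks):
    """Precompute squared distances from each query sub-vector to every centroid.

    codebooks[j] is the list of centroids of sub-quantizer j; sub-quantizers
    may have different numbers of centroids.
    """
    tables = []
    offset = 0
    for centroids in codebooks:
        d_sub = len(centroids[0])
        sub_query = query[offset:offset + d_sub]
        tables.append(
            [sum((q - c) ** 2 for q, c in zip(sub_query, centroid))
             for centroid in centroids]
        )
        offset += d_sub
    return tables


def adc_distance(code, tables):
    """Approximate squared distance: one table lookup per sub-quantizer."""
    return sum(table[index] for table, index in zip(tables, code))


# Toy setup: a 4-dimensional query and two 2-dimensional sub-quantizers.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],              # sub-quantizer 0: 2 centroids
    [[1.0, 1.0], [3.0, 1.0], [1.0, 4.0]],  # sub-quantizer 1: 3 centroids
]
query = [0.0, 0.0, 1.0, 1.0]
tables = build_lookup_tables(query, codebooks)

database_codes = [(1, 0), (0, 1), (0, 0)]
distances = [adc_distance(code, tables) for code in database_codes]
nearest = min(range(len(database_codes)), key=lambda i: distances[i])
print(distances, "nearest code:", database_codes[nearest])
```

The per-code cost is dominated by the table lookups in `adc_distance`, which is the memory-access bottleneck the abstract refers to; keeping such tables in SIMD registers is the kind of optimization Quick ADC and Quicker ADC target.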
Lyman-$\alpha$ (Ly$\alpha$) is a powerful astrophysical probe. Not only is it ubiquitous at high redshifts, it is also a resonant line, so Ly$\alpha$ photons scatter. This scattering process depends on the physical conditions of the gas through which Ly$\alpha$ propagates, and these conditions are imprinted on observables such as the Ly$\alpha$ spectrum and its surface brightness profile. In this work, we focus on a less-used observable capable of probing any scattering process: polarization. We implement the density matrix formalism of polarization into the Monte Carlo radiative transfer code tlac. This allows us to treat it as a quantum mechanical process where single photons develop and lose polarization from scatterings in arbitrary gas geometries. We explore static and expanding ellipsoids, biconical outflows, and clumpy multiphase media. We find that photons become increasingly polarized as they scatter and diffuse into the wings of the line profiles, making scattered Ly$\alpha$ polarized in general. The degree and orientation of Ly$\alpha$ polarization depend on the kinematics and distribution of the scattering HI gas. We find that it generally probes spatial or velocity space asymmetries and aligns itself tangentially to the emission source. We show that the aforementioned observables, when studied separately, can leave similar signatures for different source models. We conclude by revealing how a joint analysis of the Ly$\alpha$ spectra, surface brightness profiles, and polarization can break these degeneracies and help us extract unique physical information on galaxies and their environments from their strongest, most prominent emission line.
Existing long-tailed recognition methods, which aim to train class-balanced models from long-tailed data, generally assume that the models will be evaluated on a uniform test class distribution. However, the practical test class distribution often violates this assumption (e.g., being long-tailed or even inversely long-tailed), which can cause existing methods to fail in real-world applications. In this work, we study a more practical task setting, called test-agnostic long-tailed recognition, where the training class distribution is long-tailed while the test class distribution is unknown and can be skewed arbitrarily. In addition to the issue of class imbalance, this task poses another challenge: the class distribution shift between the training and test samples is unidentified. To address this task, we propose a new method, called Test-time Aggregating Diverse Experts (TADE), that presents two solution strategies: (1) a novel skill-diverse expert learning strategy that trains diverse experts to excel at handling different test distributions from a single long-tailed training distribution; (2) a novel test-time expert aggregation strategy that leverages self-supervision to aggregate multiple experts for handling various test distributions. Moreover, we theoretically show that our method has a provable ability to simulate unknown test class distributions. Promising results on both vanilla and test-agnostic long-tailed recognition verify the effectiveness of TADE. Code is available at https://github.com/Vanint/TADE-AgnosticLT.
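As a rough illustration of what aggregating multiple experts can look like, the sketch below combines per-expert logits with softmax-normalized weights. This is only an assumed, generic weighted-ensemble form: the abstract does not specify how TADE computes or learns its aggregation weights through self-supervision, so the weights here are simply passed in, and the names `aggregate_experts` and `weight_scores` are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_experts(expert_logits, weight_scores):
    """Weighted sum of expert logits.

    weight_scores are unnormalized per-expert scores (assumed to be given);
    they are normalized with a softmax so the weights sum to one.
    """
    weights = softmax(weight_scores)
    num_classes = len(expert_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, expert_logits))
        for c in range(num_classes)
    ]


# Two toy experts over three classes: one favors class 0, the other class 2.
expert_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
equal = aggregate_experts(expert_logits, weight_scores=[0.0, 0.0])
skewed = aggregate_experts(expert_logits, weight_scores=[3.0, 0.0])
print("equal weights:", equal)
print("first expert favored:", skewed)
```

With equal scores the two experts contribute equally; raising one score shifts the combined prediction toward that expert, which is the degree of freedom a test-time strategy would adjust to match an unknown test class distribution.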
As vast numbers of photos are uploaded to public websites (e.g., Flickr, Bing, and Google) every day, learning from web data, also referred to as webly supervised learning, has become an increasingly popular research direction because web resources are freely available. Nevertheless, the performance gap between webly supervised learning and traditional supervised learning is still very large, owing to the label noise of web data. To be exact, the labels of images crawled from public websites are very noisy and often inaccurate. Some existing works facilitate learning from web data with the aid of extra information, such as augmenting or purifying web data by virtue of instance-level supervision, which usually demands heavy manual annotation. Instead, we propose to tackle the label noise by leveraging more accessible category-level supervision. In particular, we build our method upon a variational autoencoder (VAE), in which the classification network is attached to the hidden layer of the VAE so that the classification network and the VAE can jointly leverage the category-level hybrid semantic information. The effectiveness of our proposed method is clearly demonstrated by extensive experiments on three benchmark datasets.
Recently, generative data-free quantization has emerged as a practical approach that compresses a neural network to low bit-width without access to real data. It generates data to quantize the network by utilizing the batch normalization (BN) statistics of its full-precision counterpart. However, our study shows that in practice, synthetic data completely constrained by BN statistics suffers from severe homogenization at the distribution and sample levels, which causes serious accuracy degradation of the quantized network. This paper presents a generic Diverse Sample Generation (DSG) scheme for generative data-free post-training quantization and quantization-aware training, to mitigate the detrimental homogenization. In our DSG, we first slacken the statistics alignment for features in the BN layer to relax the distribution constraint. Then we strengthen the loss impact of the specific BN layer for different samples and inhibit the correlation among samples in the generation process, to diversify samples from the statistical and spatial perspectives, respectively. Extensive experiments show that for large-scale image classification tasks, our DSG can consistently outperform existing data-free quantization methods on various neural architectures, especially under ultra-low bit-width (e.g., a 22% gain under the W4A4 setting). Moreover, the data diversification brought by our DSG yields a general gain in various quantization methods, demonstrating that diversity is an important property of high-quality synthetic data for data-free quantization.
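One way to picture slackening the BN statistics alignment is a loss that only penalizes the gap between generated-feature statistics and the stored BN statistics once that gap exceeds a margin. The sketch below assumes this hinge-style form; the function name, the per-channel list layout, and the margin parameters `slack_mean` and `slack_std` are hypothetical, and the abstract does not give the exact loss DSG uses.

```python
def slack_alignment_loss(feat_mean, feat_std, bn_mean, bn_std,
                         slack_mean=0.0, slack_std=0.0):
    """Sum over channels of squared statistic gaps beyond a tolerated margin.

    With both margins at zero this reduces to strict alignment of the
    generated-feature statistics with the stored BN statistics.
    """
    loss = 0.0
    for fm, fs, bm, bs in zip(feat_mean, feat_std, bn_mean, bn_std):
        mean_gap = max(abs(fm - bm) - slack_mean, 0.0)
        std_gap = max(abs(fs - bs) - slack_std, 0.0)
        loss += mean_gap ** 2 + std_gap ** 2
    return loss


# Toy statistics for a two-channel BN layer (illustrative values only).
bn_mean, bn_std = [0.0, 1.0], [1.0, 2.0]
feat_mean, feat_std = [0.3, 1.0], [1.2, 2.0]

strict = slack_alignment_loss(feat_mean, feat_std, bn_mean, bn_std)
relaxed = slack_alignment_loss(feat_mean, feat_std, bn_mean, bn_std,
                               slack_mean=0.5, slack_std=0.5)
print("strict loss:", strict)
print("relaxed loss:", relaxed)
```

Under strict alignment every deviation is penalized, pushing all synthetic samples toward the same statistics; with a margin, samples whose statistics fall inside the tolerated band incur no penalty, which leaves room for the diversity the abstract argues for.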