No Arabic abstract
Accuracy predictor is trained to predict the validation accuracy of an network from its architecture encoding. It can effectively assist in designing networks and improving Neural Architecture Search(NAS) efficiency. However, a high-performance predictor depends on adequate trainning samples, which requires unaffordable computation overhead. To alleviate this problem, we propose a novel framework to train an accuracy predictor under few training samples. The framework consists ofdata augmentation methods and an ensemble learning algorithm. The data augmentation methods calibrate weak labels and inject noise to feature space. The ensemble learning algorithm, termed cascade bagging, trains two-level models by sampling data and features. In the end, the advantages of above methods are proved in the Performance Prediciton Track of CVPR2021 1st Lightweight NAS Challenge. Our code is made public at: https://github.com/dlongry/Solutionto-CVPR2021-NAS-Track2.
Link scheduling in device-to-device (D2D) networks is usually formulated as a non-convex combinatorial problem, which is generally NP-hard and difficult to get the optimal solution. Traditional methods to solve this problem are mainly based on mathematical optimization techniques, where accurate channel state information (CSI), usually obtained through channel estimation and feedback, is needed. To overcome the high computational complexity of the traditional methods and eliminate the costly channel estimation stage, machine leaning (ML) has been introduced recently to address the wireless link scheduling problems. In this paper, we propose a novel graph embedding based method for link scheduling in D2D networks. We first construct a fully-connected directed graph for the D2D network, where each D2D pair is a node while interference links among D2D pairs are the edges. Then we compute a low-dimensional feature vector for each node in the graph. The graph embedding process is based on the distances of both communication and interference links, therefore without requiring the accurate CSI. By utilizing a multi-layer classifier, a scheduling strategy can be learned in a supervised manner based on the graph embedding results for each node. We also propose an unsupervised manner to train the graph embedding based method to further reinforce the scalability and generalizability and develop a K-nearest neighbor graph representation method to reduce the computational complexity. Extensive simulation demonstrates that the proposed method is near-optimal compared with the existing state-of-art methods but is with only hundreds of training samples. It is also competitive in terms of scalability and generalizability to more complicated scenarios.
Few-Shot Learning (FSL) aims to improve a models generalization capability in low data regimes. Recent FSL works have made steady progress via metric learning, meta learning, representation learning, etc. However, FSL remains challenging due to the following longstanding difficulties. 1) The seen and unseen classes are disjoint, resulting in a distribution shift between training and testing. 2) During testing, labeled data of previously unseen classes is sparse, making it difficult to reliably extrapolate from labeled support examples to unlabeled query examples. To tackle the first challenge, we introduce Hybrid Consistency Training to jointly leverage interpolation consistency, including interpolating hidden features, that imposes linear behavior locally and data augmentation consistency that learns robust embeddings against sample variations. As for the second challenge, we use unlabeled examples to iteratively normalize features and adapt prototypes, as opposed to commonly used one-time update, for more reliable prototype-based transductive inference. We show that our method generates a 2% to 5% improvement over the state-of-the-art methods with similar backbones on five FSL datasets and, more notably, a 7% to 8% improvement for more challenging cross-domain FSL.
Scene text recognition (STR) is still a hot research topic in computer vision field due to its various applications. Existing works mainly focus on learning a general model with a huge number of synthetic text images to recognize unconstrained scene texts, and have achieved substantial progress. However, these methods are not quite applicable in many real-world scenarios where 1) high recognition accuracy is required, while 2) labeled samples are lacked. To tackle this challenging problem, this paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach to build sequence adaptation between the synthetic source domain (with many synthetic labeled samples) and a specific target domain (with only some or a few real labeled samples). This is done by simultaneously learning each characters feature representation with an attention mechanism and establishing the corresponding character-level latent subspace with adversarial learning. Our approach can maximize the character-level confusion between the source domain and the target domain, thus achieves the sequence-level adaptation with even a small number of labeled samples in the target domain. Extensive experiments on various datasets show that our method significantly outperforms the finetuning scheme, and obtains comparable performance to the state-of-the-art STR methods.
The challenges of high intra-class variance yet low inter-class fluctuations in fine-grained visual categorization are more severe with few labeled samples, textit{i.e.,} Fine-Grained categorization problems under the Few-Shot setting (FGFS). High-order features are usually developed to uncover subtle differences between sub-categories in FGFS, but they are less effective in handling the high intra-class variance. In this paper, we propose a Target-Oriented Alignment Network (TOAN) to investigate the fine-grained relation between the target query image and support classes. The feature of each support image is transformed to match the query ones in the embedding feature space, which reduces the disparity explicitly within each category. Moreover, different from existing FGFS approaches devise the high-order features over the global image with less explicit consideration of discriminative parts, we generate discriminative fine-grained features by integrating compositional concept representations to global second-order pooling. Extensive experiments are conducted on four fine-grained benchmarks to demonstrate the effectiveness of TOAN compared with the state-of-the-art models.
Surface defect detection plays an increasingly important role in manufacturing industry to guarantee the product quality. Many deep learning methods have been widely used in surface defect detection tasks, and have been proven to perform well in defects classification and location. However, deep learning-based detection methods often require plenty of data for training, which fail to apply to the real industrial scenarios since the distribution of defect categories is often imbalanced. In other words, common defect classes have many samples but rare defect classes have extremely few samples, and it is difficult for these methods to well detect rare defect classes. To solve the imbalanced distribution problem, in this paper we propose TL-SDD: a novel Transfer Learning-based method for Surface Defect Detection. First, we adopt a two-phase training scheme to transfer the knowledge from common defect classes to rare defect classes. Second, we propose a novel Metric-based Surface Defect Detection (M-SDD) model. We design three modules for this model: (1) feature extraction module: containing feature fusion which combines high-level semantic information with low-level structural information. (2) feature reweighting module: transforming examples to a reweighting vector that indicates the importance of features. (3) distance metric module: learning a metric space in which defects are classified by computing distances to representations of each category. Finally, we validate the performance of our proposed method on a real dataset including surface defects of aluminum profiles. Compared to the baseline methods, the performance of our proposed method has improved by up to 11.98% for rare defect classes.