
Recent advances in deep neural networks have driven remarkable progress in dense image prediction. However, most existing approaches neglect feature alignment for the sake of simplicity. Directly adding upsampled features to local features pixel by pixel yields feature maps with misaligned contexts, which in turn cause misclassifications in prediction, especially at object boundaries. In this paper, we propose a feature alignment module that learns transformation offsets of pixels to contextually align upsampled higher-level features, and a feature selection module that emphasizes the lower-level features with rich spatial details. We then integrate these two modules into a top-down pyramidal architecture and present the Feature-aligned Pyramid Network (FaPN). Extensive experimental evaluations on four dense prediction tasks and four datasets demonstrate the efficacy of FaPN, yielding an overall improvement of 1.2 - 2.6 points in AP / mIoU over FPN when paired with Faster / Mask R-CNN. In particular, FaPN achieves a state-of-the-art 56.7% mIoU on ADE20K when integrated within Mask-Former. The code is available from https://github.com/EMI-Group/FaPN.
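The misalignment described in the FaPN abstract can be illustrated with a small one-dimensional sketch. This is not the FaPN implementation: the abstract says the offsets are learned, whereas here they are supplied by hand, and the helper names (`upsample_nearest`, `sample_linear`, `align`, `fuse`) are hypothetical. The sketch only shows why adding an upsampled coarse feature to a fine feature can blur an edge, and how per-pixel offsets can restore it.

```python
def upsample_nearest(coarse, factor=2):
    """Nearest-neighbour upsampling: repeat every coarse value `factor` times."""
    return [v for v in coarse for _ in range(factor)]


def sample_linear(signal, pos):
    """Read `signal` at a fractional position using linear interpolation."""
    pos = min(max(pos, 0.0), len(signal) - 1)  # clamp to the valid range
    lo = int(pos)
    hi = min(lo + 1, len(signal) - 1)
    frac = pos - lo
    return (1 - frac) * signal[lo] + frac * signal[hi]


def align(upsampled, offsets):
    """Resample each position i at i + offsets[i].

    Assumption: the offsets are hand-picked here; in the paper they are
    learned by the feature alignment module.
    """
    return [sample_linear(upsampled, i + off) for i, off in enumerate(offsets)]


def fuse(fine, upsampled):
    """Top-down fusion by element-wise addition."""
    return [a + b for a, b in zip(fine, upsampled)]


# The fine feature has an edge at index 3; the upsampled coarse feature
# places the same edge at index 4, one position late.
fine = [0, 0, 0, 1, 1, 1, 1, 1]
upsampled = upsample_nearest([0, 0, 1, 1])

naive = fuse(fine, upsampled)                          # edge smeared over two positions
aligned = fuse(fine, align(upsampled, [1.0] * 8))      # edge stays at index 3
print(naive)
print(aligned)
```

In the naive fusion the boundary position receives a response from only one of the two inputs, which is the kind of boundary ambiguity the abstract attributes to direct pixel addition.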
In the literature, computer architectures are frequently claimed to be highly flexible, typically implying that there are trade-offs between flexibility and performance or energy efficiency. Processor flexibility, however, is not sharply defined, so these claims cannot be validated, nor can such hypothetical relations be fully understood and exploited in the design of computing systems. This paper attempts to introduce scientific rigour to the notion of flexibility in computing systems.
Despite the remarkable successes of Convolutional Neural Networks (CNNs) in computer vision, manually designing a CNN is time-consuming and error-prone. Among the various Neural Architecture Search (NAS) methods that aim to automate the design of high-performance CNNs, differentiable NAS and population-based NAS are attracting increasing interest due to their distinctive characteristics. To benefit from the merits of both while overcoming their deficiencies, this work proposes a novel NAS method, RelativeNAS. As the key to efficient search, RelativeNAS performs joint learning between fast-learners (i.e. networks with relatively higher accuracy) and slow-learners in a pairwise manner. Moreover, since RelativeNAS only requires low-fidelity performance estimation to distinguish the fast-learner from the slow-learner in each pair, it saves some of the computation cost of training the candidate architectures. The proposed RelativeNAS brings several unique advantages: (1) it achieves state-of-the-art performance on ImageNet with a top-1 error rate of 24.88%, i.e. outperforming DARTS and AmoebaNet-B by 1.82% and 1.12% respectively; (2) it spends only nine hours with a single 1080Ti GPU to obtain the discovered cells, i.e. 3.75x and 7875x faster than DARTS and AmoebaNet respectively; (3) it shows that the discovered cells obtained on CIFAR-10 can be directly transferred to object detection, semantic segmentation, and keypoint detection, yielding competitive results of 73.1% mAP on PASCAL VOC, 78.7% mIoU on Cityscapes, and 68.5% AP on MSCOCO, respectively. The implementation of RelativeNAS is available at https://github.com/EMI-Group/RelativeNAS
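The pairwise fast-learner / slow-learner idea can be sketched on a toy problem. Everything below is an assumption made for illustration and is not taken from the RelativeNAS code: candidates are plain real-valued vectors rather than encoded architectures, pairs are drawn at random, the update is a simple interpolation toward the fast-learner, and `toy_score` stands in for a low-fidelity accuracy estimate.

```python
import random


def pairwise_step(population, score, rng, step=0.5):
    """One generation of pairwise learning (illustrative only).

    Individuals are paired at random. In each pair, the lower-scoring vector
    (the slow-learner) moves part of the way toward the higher-scoring vector
    (the fast-learner); the fast-learner is left unchanged.
    """
    order = list(range(len(population)))
    rng.shuffle(order)
    new_population = [list(v) for v in population]
    for a, b in zip(order[0::2], order[1::2]):
        if score(population[a]) >= score(population[b]):
            fast, slow = a, b
        else:
            fast, slow = b, a
        new_population[slow] = [
            s + step * (f - s)
            for s, f in zip(population[slow], population[fast])
        ]
    return new_population


def toy_score(vector):
    """Hypothetical stand-in for a cheap performance estimate (higher is better)."""
    return -sum(x * x for x in vector)


rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(6)]
for _ in range(10):
    population = pairwise_step(population, toy_score, rng)
print(max(toy_score(v) for v in population))
```

Only the ordering within each pair matters for the update, which is consistent with the abstract's point that a low-fidelity estimate is enough to tell the fast-learner from the slow-learner.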
Although significant advances have been made in image-to-image (I2I) translation with Generative Adversarial Networks (GANs), it remains challenging to effectively translate an image into a set of diverse images in multiple target domains using a single generator and discriminator pair. Existing multimodal I2I translation methods adopt multiple domain-specific content encoders for different domains, where each domain-specific content encoder is trained only with images from its own domain. We argue, however, that the content (domain-invariant) features should be learned from images across all the domains. Consequently, each domain-specific content encoder in existing schemes fails to extract the domain-invariant features efficiently. To address this issue, we present a flexible and general SoloGAN model for efficient multimodal I2I translation among multiple domains with unpaired data. In contrast to existing methods, the SoloGAN algorithm uses a single projection discriminator with an additional auxiliary classifier, and shares the encoder and generator across all domains. As such, the SoloGAN model can be trained effectively with images from all domains, so that the domain-invariant content representation can be efficiently extracted. Qualitative and quantitative results over a wide range of datasets, against several counterparts and variants of the SoloGAN model, demonstrate the merits of the method, especially for challenging I2I translation tasks, i.e., tasks that involve extreme shape variations or need to keep complex backgrounds unchanged after translation. Furthermore, we demonstrate the contribution of each component using ablation studies.
Recently, a growing number of works have proposed driving evolutionary algorithms with machine learning models. The performance of such model-based evolutionary algorithms usually depends heavily on the training quality of the adopted models. Since model training usually requires a certain amount of data (i.e. the candidate solutions generated by the algorithms), performance deteriorates rapidly as the problem scale increases, due to the curse of dimensionality. To address this issue, we propose a multi-objective evolutionary algorithm driven by generative adversarial networks (GANs). At each generation of the proposed algorithm, the parent solutions are first classified into real and fake samples to train the GANs; the offspring solutions are then sampled by the trained GANs. Thanks to the powerful generative ability of the GANs, the proposed algorithm is capable of generating promising offspring solutions in a high-dimensional decision space with limited training data. The proposed algorithm is tested on 10 benchmark problems with up to 200 decision variables. Experimental results on these test problems demonstrate the effectiveness of the proposed algorithm.
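The abstract does not state how parent solutions are split into real and fake samples. As one plausible reading, and purely as an assumption, the sketch below labels non-dominated parents as "real" and dominated parents as "fake" using Pareto dominance for minimisation. The function names are hypothetical, and the GAN training and sampling steps are omitted.

```python
def dominates(a, b):
    """Pareto dominance for minimisation: a is no worse than b in every
    objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def label_real_fake(objectives):
    """Label each parent by its objective vector.

    Assumption (not from the abstract): non-dominated parents are 'real'
    samples and dominated parents are 'fake' samples.
    """
    labels = []
    for i, a in enumerate(objectives):
        is_dominated = any(
            dominates(b, a) for j, b in enumerate(objectives) if j != i
        )
        labels.append("fake" if is_dominated else "real")
    return labels


# Five parents on a two-objective minimisation problem.
parents = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
print(label_real_fake(parents))
```

Under this reading, the discriminator would be trained to separate the two groups so that the generator is steered toward the region occupied by the "real" parents.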
Shihua Huang, Lu Wang (2019)
Driven by Convolutional Neural Networks, object detection and semantic segmentation have improved significantly. However, existing methods based on a full top-down module have limited robustness when handling these two tasks simultaneously. To this end, we present a joint multi-task framework, termed IvaNet. Unlike existing methods, IvaNet passes abstract semantic information backward from higher layers to augment lower layers using local top-down modules. Comparisons against several counterparts on the PASCAL VOC and MS COCO datasets demonstrate the functionality of IvaNet.
