Sub-pixel face landmarks using heatmaps and a bag of tricks

138 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Samuel Earp

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Samuel W. F. Earp - Aubin Samacoits - Sanjana Jain

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Accurate face landmark localization is an essential part of face recognition, reconstruction and morphing. To accurately localize face landmarks, we present our heatmap regression approach. Each model consists of a MobileNetV2 backbone followed by several upscaling layers, with different tricks to optimize both performance and inference cost. We use five naive face landmarks from a publicly available face detector to position and align the face instead of using the bounding box like traditional methods. Moreover, we show by adding random rotation, displacement and scaling -- after alignment -- that the model is more sensitive to the face position than orientation. We also show that it is possible to reduce the upscaling complexity by using a mixture of deconvolution and pixel-shuffle layers without impeding localization performance. We present our state-of-the-art face landmark localization model (ranking second on The 2nd Grand Challenge of 106-Point Facial Landmark Localization validation set). Finally, we test the effect on face recognition using these landmarks, using a publicly available model and benchmarks.

قيم البحث

111 - Thomas Elsken , Benedikt Staffler , Arber Zela 2021

While neural architecture search methods have been successful in previous years and led to new state-of-the-art performance on various problems, they have also been criticized for being unstable, being highly sensitive with respect to their hyperpara meters, and often not performing better than random search. To shed some light on this issue, we discuss some practical considerations that help improve the stability, efficiency and overall performance.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

Face Detection with Feature Pyramids and Landmarks

224 - Samuel W. F. Earp , Pavit Noinongyao , Justin A. Cairns 2019

Accurate face detection and facial landmark localization are crucial to any face recognition system. We present a series of three single-stage RCNNs with different sized backbones (MobileNetV2-25, MobileNetV2-100, and ResNet101) and a six-layer featu re pyramid trained exclusively on the WIDER FACE dataset. We compare the face detection and landmark accuracies using eight context module architectures, four proposed by previous research and four modifi

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو

PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System

223 - Yuning Du , Chenxia Li , Ruoyu Guo 2021

Optical Character Recognition (OCR) systems have been widely used in various of application scenarios. Designing an OCR system is still a challenging task. In previous work, we proposed a practical ultra lightweight OCR system (PP-OCR) to balance the accuracy against the efficiency. In order to improve the accuracy of PP-OCR and keep high efficiency, in this paper, we propose a more robust OCR system, i.e. PP-OCRv2. We introduce bag of tricks to train a better text detector and a better text recognizer, which include Collaborative Mutual Learning (CML), CopyPaste, Lightweight CPUNetwork (LCNet), Unified-Deep Mutual Learning (U-DML) and Enhanced CTCLoss. Experiments on real data show that the precision of PP-OCRv2 is 7% higher than PP-OCR under the same inference cost. It is also comparable to the server models of the PP-OCR which uses ResNet series as backbones. All of the above mentioned models are open-sourced and the code is available in the GitHub repository PaddleOCR which is powered by PaddlePaddle.

الرؤية الحاسوبية وتمييز الأنماط

Bag of Tricks for Adversarial Training

134 - Tianyu Pang , Xiao Yang , Yinpeng Dong 2020

Adversarial training (AT) is one of the most effective strategies for promoting model robustness. However, recent benchmarks show that most of the proposed improvements on AT are less effective than simply early stopping the training procedure. This counter-intuitive fact motivates us to investigate the implementation details of tens of AT methods. Surprisingly, we find that the basic settings (e.g., weight decay, training schedule, etc.) used in these methods are highly inconsistent. In this work, we provide comprehensive evaluations on CIFAR-10, focusing on the effects of mostly overlooked training tricks and hyperparameters for adversarially trained models. Our empirical observations suggest that adversarial robustness is much more sensitive to some basic training settings than we thought. For example, a slightly different value of weight decay can reduce the model robust accuracy by more than 7%, which is probable to override the potential promotion induced by the proposed methods. We conclude a baseline training setting and re-implement previous defenses to achieve new state-of-the-art results. These facts also appeal to more concerns on the overlooked confounders when benchmarking defenses.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study

176 - Tianlong Chen , Kaixiong Zhou , Keyu Duan 2021

Training deep graph neural networks (GNNs) is notoriously hard. Besides the standard plights in training deep architectures such as vanishing gradients and overfitting, the training of deep GNNs also uniquely suffers from over-smoothing, information squashing, and so on, which limits their potential power on large-scale graphs. Although numerous efforts are proposed to address these limitations, such as various forms of skip connections, graph normalization, and random dropping, it is difficult to disentangle the advantages brought by a deep GNN architecture from those tricks necessary to train such an architecture. Moreover, the lack of a standardized benchmark with fair and consistent experimental settings poses an almost insurmountable obstacle to gauging the effectiveness of new mechanisms. In view of those, we present the first fair and reproducible benchmark dedicated to assessing the tricks of training deep GNNs. We categorize existing approaches, investigate their hyperparameter sensitivity, and unify the basic configuration. Comprehensive evaluations are then conducted on tens of representative graph datasets including the recent large-scale Open Graph Benchmark (OGB), with diverse deep GNN backbones. Based on synergistic studies, we discover the combo of superior training tricks, that lead us to attain the new state-of-the-art results for deep GCNs, across multiple representative graph datasets. We demonstrate that an organic combo of initial connection, identity mapping, group and batch normalization has the most ideal performance on large datasets. Experiments also reveal a number of surprises when combining or scaling up some of the tricks. All codes are available at https://github.com/VITA-Group/Deep_GCN_Benchmarking.

التعلم الآلي الذكاء الاصطناعي