Controllable Orthogonalization in Training DNNs

61 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Lei Huang

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Lei Huang - Li Liu - Fan Zhu

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Orthogonality is widely used for training deep neural networks (DNNs) due to its ability to maintain all singular values of the Jacobian close to 1 and reduce redundancy in representation. This paper proposes a computationally efficient and numerically stable orthogonalization method using Newtons iteration (ONI), to learn a layer-wise orthogonal weight matrix in DNNs. ONI works by iteratively stretching the singular values of a weight matrix towards 1. This property enables it to control the orthogonality of a weight matrix by its number of iterations. We show that our method improves the performance of image classification networks by effectively controlling the orthogonality to provide an optimal tradeoff between optimization benefits and representational capacity reduction. We also show that ONI stabilizes the training of generative adversarial networks (GANs) by maintaining the Lipschitz continuity of a network, similar to spectral normalization (SN), and further outperforms SN by providing controllable orthogonality.

قيم البحث

285 - Rohit Saha , Brendan Duke , Florian Shkurti 2021

Hairstyle transfer is challenging due to hair structure differences in the source and target hair. Therefore, we propose Latent Optimization of Hairstyles via Orthogonalization (LOHO), an optimization-based approach using GAN inversion to infill miss ing hair structure details in latent space during hairstyle transfer. Our approach decomposes hair into three attributes: perceptual structure, appearance, and style, and includes tailored losses to model each of these attributes independently. Furthermore, we propose two-stage optimization and gradient orthogonalization to enable disentangled latent space optimization of our hair attributes. Using LOHO for latent space manipulation, users can synthesize novel photorealistic images by manipulating hair attributes either individually or jointly, transferring the desired attributes from reference hairstyles. LOHO achieves a superior FID compared with the current state-of-the-art (SOTA) for hairstyle transfer. Additionally, LOHO preserves the subjects identity comparably well according to PSNR and SSIM when compared to SOTA image embedding pipelines. Code is available at https://github.com/dukebw/LOHO.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Training DNNs in O(1) memory with MEM-DFA using Random Matrices

244 - Tien Chu , Kamil Mykitiuk , Miron Szewczyk 2020

This work presents a method for reducing memory consumption to a constant complexity when training deep neural networks. The algorithm is based on the more biologically plausible alternatives of the backpropagation (BP): direct feedback alignment (DF A) and feedback alignment (FA), which use random matrices to propagate error. The proposed method, memory-efficient direct feedback alignment (MEM-DFA), uses higher independence of layers in DFA and allows avoiding storing at once all activation vectors, unlike standard BP, FA, and DFA. Thus, our algorithms memory usage is constant regardless of the number of layers in a neural network. The method increases the computational cost only by a constant factor of one extra forward pass. The MEM-DFA, BP, FA, and DFA were evaluated along with their memory profiles on MNIST and CIFAR-10 datasets on various neural network models. Our experiments agree with our theoretical results and show a significant decrease in the memory cost of MEM-DFA compared to the other algorithms.

الرؤية الحاسوبية وتمييز الأنماط

From Quantized DNNs to Quantizable DNNs

231 - Kunyuan Du , Ya Zhang , Haibing Guan 2020

This paper proposes Quantizable DNNs, a special type of DNNs that can flexibly quantize its bit-width (denoted as `bit modes thereafter) during execution without further re-training. To simultaneously optimize for all bit modes, a combinational loss of all bit modes is proposed, which enforces consistent predictions ranging from low-bit mode to 32-bit mode. This Consistency-based Loss may also be viewed as certain form of regularization during training. Because outputs of matrix multiplication in different bit modes have different distributions, we introduce Bit-Specific Batch Normalization so as to reduce conflicts among different bit modes. Experiments on CIFAR100 and ImageNet have shown that compared to quantized DNNs, Quantizable DNNs not only have much better flexibility, but also achieve even higher classification accuracy. Ablation studies further verify that the regularization through the consistency-based loss indeed improves the models generalization performance.

الرؤية الحاسوبية وتمييز الأنماط

Training DNNs with Hybrid Block Floating Point

73 - Mario Drumond , Tao Lin , Martin Jaggi 2018

The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full precision floating-point arithm etic to maximize performance per area. Ongoing research efforts seek to further increase that performance density by replacing floating-point with fixed-point arithmetic. However, a significant roadblock for these attempts has been fixed points narrow dynamic range, which is insufficient for DNN training convergence. We identify block floating point (BFP) as a promising alternative representation since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic. Unfortunately, BFP alone introduces several limitations that preclude its direct applicability. In this work, we introduce HBFP, a hybrid BFP-FP approach, which performs all dot products in BFP and other operations in floating point. HBFP delivers the best of both worlds: the high accuracy of floating point at the superior hardware density of fixed point. For a wide variety of models, we show that HBFP matches floating points accuracy while enabling hardware implementations that deliver up to 8.5x higher throughput.

التعلم الآلي التحليل العددي التعلم الالي

CONFIG: Controllable Neural Face Image Generation

119 - Marek Kowalski , Stephan J. Garbin , Virginia Estellers 2020

Our ability to sample realistic natural images, particularly faces, has advanced by leaps and bounds in recent years, yet our ability to exert fine-tuned control over the generative process has lagged behind. If this new technology is to find practic al uses, we need to achieve a level of control over generative networks which, without sacrificing realism, is on par with that seen in computer graphics and character animation. To this end we propose ConfigNet, a neural face model that allows for controlling individual aspects of output images in semantically meaningful ways and that is a significant step on the path towards finely-controllable neural rendering. ConfigNet is trained on real face images as well as synthetic face renders. Our novel method uses synthetic data to factorize the latent space into elements that correspond to the inputs of a traditional rendering pipeline, separating aspects such as head pose, facial expression, hair style, illumination, and many others which are very hard to annotate in real data. The real images, which are presented to the network without labels, extend the variety of the generated images and encourage realism. Finally, we propose an evaluation criterion using an attribute detection network combined with a user study and demonstrate state-of-the-art individual control over attributes in the output images.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي