We introduce Tuna, a static analysis approach to optimizing deep neural network programs. Optimizing tensor operations such as convolutions and matrix multiplications is key to improving the performance of deep neural networks. Many deep learning model optimization mechanisms today use dynamic analysis, which relies on experimental execution on a target device to build a data-driven cost model of the program. This reliance on dynamic profiling not only requires access to target hardware at compilation time but also incurs significant cost in machine resources. We introduce an approach that profiles the program by constructing features based on the target hardware characteristics. We use static analysis of the relative performance of tensor operations to optimize the deep learning program. Experiments show that our approach can achieve up to 11x the performance of dynamic-profiling-based methods given the same compilation time.
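To make the idea of ranking tensor-operation variants without executing them concrete, the following is a minimal sketch, not Tuna's actual cost model. It scores tilings of a matrix multiplication using only a hardware description. The `Hardware` fields, the flat spill penalty, and the names `static_cost` and `rank_tilings` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Hardware:
    """Hypothetical hardware description (illustrative fields, not Tuna's)."""
    cache_bytes: int     # capacity of the cache level being modeled
    vector_lanes: int    # SIMD lanes available for the element type
    elem_bytes: int = 4  # size of one element, float32 assumed


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def static_cost(m, n, k, tile, hw):
    """Relative cost of a tiled (m x k) @ (k x n) matmul, computed without running it."""
    tm, tn, tk = tile
    # Elements touched by one tile step: A tile, B tile, and C tile.
    tile_elems = tm * tk + tk * tn + tm * tn
    working_set = tile_elems * hw.elem_bytes
    # Number of tile steps over the whole iteration space.
    steps = ceil_div(m, tm) * ceil_div(n, tn) * ceil_div(k, tk)
    traffic = steps * tile_elems
    # Assumed flat 4x penalty when the working set exceeds the cache.
    spill_penalty = 4.0 if working_set > hw.cache_bytes else 1.0
    # Fraction of vector lanes doing useful work along the innermost tile dimension.
    padded = ceil_div(tn, hw.vector_lanes) * hw.vector_lanes
    vector_util = tn / padded
    return traffic * spill_penalty / vector_util


def rank_tilings(m, n, k, hw, sizes=(8, 16, 32, 64)):
    """Return candidate (tm, tn, tk) tilings sorted from cheapest to most expensive."""
    candidates = list(product(sizes, repeat=3))
    return sorted(candidates, key=lambda t: static_cost(m, n, k, t, hw))


if __name__ == "__main__":
    hw = Hardware(cache_bytes=32 * 1024, vector_lanes=8)
    print(rank_tilings(256, 256, 256, hw)[:3])
```

Because the score depends only on loop bounds and the hardware description, the ranking needs no access to the target device at compilation time. A practical model would have to calibrate terms such as the spill penalty, which is fixed arbitrarily here.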
We propose a novel Bayesian neural network architecture that can learn invariances from data alone by inferring a posterior distribution over different weight-sharing schemes. We show that our model outperforms other non-invariant architectures, when
In this paper we establish a connection between non-convex optimization methods for training deep neural networks and nonlinear partial differential equations (PDEs). Relaxation techniques arising in statistical physics which have already been used s
Neural personalized recommendation is the cornerstone of a wide collection of cloud services and products, constituting significant compute demand of the cloud infrastructure. Thus, improving the execution efficiency of neural recommendation directl
With the increasing popularity of deep learning, Convolutional Neural Networks (CNNs) have been widely applied in various domains, such as image classification and object detection, and achieve stunning success in terms of their high accuracy over th
Edge computing offers an additional layer of compute infrastructure closer to the data source before raw data from privacy-sensitive and performance-critical applications is transferred to a cloud data center. Deep Neural Networks (DNNs) are one clas