Transferable Graph Optimizers for ML Compilers


Published by: Yanqi Zhou

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi

Topics: Machine learning; Distributed, parallel, and cluster computing


Abstract

Most compilers for machine learning (ML) frameworks need to solve many correlated optimization problems to generate efficient machine code. Current ML compilers rely on heuristics-based algorithms to solve these optimization problems one at a time. However, this approach is not only hard to maintain but also often leads to sub-optimal solutions, especially for newer model architectures. Existing learning-based approaches in the literature are sample-inefficient, tackle a single optimization problem, and do not generalize to unseen graphs, making them infeasible to deploy in practice. To address these limitations, we propose an end-to-end, transferable deep reinforcement learning method for computational graph optimization (GO), based on a scalable sequential attention mechanism over an inductive graph neural network. GO generates decisions on the entire graph rather than on each individual node autoregressively, drastically speeding up the search compared to prior methods. Moreover, we propose recurrent attention layers to jointly optimize dependent graph optimization tasks and demonstrate 33%-60% speedup on three graph optimization tasks compared to TensorFlow default optimization. On a diverse set of representative graphs consisting of up to 80,000 nodes, including Inception-v3, Transformer-XL, and WaveNet, GO achieves on average 21% improvement over human experts and 18% improvement over the prior state of the art with 15x faster convergence, on a device placement task evaluated in real systems.
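To make the device placement task mentioned in the abstract more concrete, the following is a minimal sketch of how a candidate placement of a computational graph could be scored. It does not reproduce the paper's method or its cost model. The function name `placement_makespan`, the fixed `transfer_time` per cross-device edge, and the one-op-at-a-time device model are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def placement_makespan(
    compute_time: Dict[str, float],
    edges: List[Tuple[str, str]],
    placement: Dict[str, int],
    transfer_time: float = 1.0,
) -> float:
    """Estimate when the last op finishes under a given device placement.

    Hypothetical toy cost model (not from the paper):
      * keys of `compute_time` are listed in topological order,
      * each device executes one op at a time, in that order,
      * every edge that crosses devices costs a fixed `transfer_time`.
    """
    # Collect the predecessors of every node.
    preds: Dict[str, List[str]] = {node: [] for node in compute_time}
    for src, dst in edges:
        preds[dst].append(src)

    finish: Dict[str, float] = {}       # finish time of each op
    device_free: Dict[int, float] = {}  # time at which each device is idle

    for node in compute_time:  # dicts preserve insertion order
        dev = placement[node]
        ready = 0.0
        for p in preds[node]:
            # Inputs from another device arrive after a transfer delay.
            delay = transfer_time if placement[p] != dev else 0.0
            ready = max(ready, finish[p] + delay)
        start = max(ready, device_free.get(dev, 0.0))
        finish[node] = start + compute_time[node]
        device_free[dev] = finish[node]

    return max(finish.values())


# A diamond-shaped graph: a feeds b and c, which both feed d.
times = {"a": 1.0, "b": 2.0, "c": 2.0, "d": 1.0}
graph = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

single = placement_makespan(times, graph, {"a": 0, "b": 0, "c": 0, "d": 0})
split = placement_makespan(
    times, graph, {"a": 0, "b": 0, "c": 1, "d": 0}, transfer_time=0.5
)
```

In this toy example, moving `c` to a second device lets `b` and `c` overlap, so the estimated finish time drops even after paying for two transfers. A learned policy such as the one the abstract describes would search over such placements; here the placements are simply written by hand.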


Read also

Learning Transferable Graph Exploration

Hanjun Dai, Yujia Li, Chenglong Wang (2019)

This paper considers the problem of efficient exploration of unseen environments, a key challenge in AI. We propose a "learning to explore" framework where we learn a policy from a distribution of environments. At test time, presented with an unseen environment from the same distribution, the policy aims to generalize the exploration strategy to visit the maximum number of unique states in a limited number of steps. We particularly focus on environments with graph-structured state-spaces that are encountered in many important real-world applications like software testing and map building. We formulate this task as a reinforcement learning problem where the "exploration agent" is rewarded for transitioning to previously unseen environment states and employ a graph-structured memory to encode the agent's past trajectory. Experimental results demonstrate that our approach is extremely effective for exploration of spatial maps; and when applied on the challenging problems of coverage-guided software-testing of domain-specific programs and real-world mobile applications, it outperforms methods that have been hand-engineered by human experts.

Topics: Machine learning
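The reward described in the abstract of "Learning Transferable Graph Exploration" above, where the agent is rewarded for reaching previously unseen states, can be illustrated with a short sketch. This is only an assumed reading of that one sentence: the function name `novelty_rewards` and the choice of a reward of 1.0 per new state are hypothetical, and the learned policy and graph-structured memory from the paper are not modeled.

```python
from typing import Hashable, List, Sequence


def novelty_rewards(trajectory: Sequence[Hashable]) -> List[float]:
    """Return one reward per transition in `trajectory`.

    Hypothetical reward rule: a transition earns 1.0 if it lands on a
    state that has not appeared earlier in the trajectory, else 0.0.
    The starting state counts as already visited.
    """
    if not trajectory:
        return []

    visited = {trajectory[0]}
    rewards: List[float] = []
    for state in trajectory[1:]:
        rewards.append(0.0 if state in visited else 1.0)
        visited.add(state)
    return rewards


# The agent walks a -> b -> a -> c; revisiting "a" earns nothing.
rewards = novelty_rewards(["a", "b", "a", "c"])
total = sum(rewards)
```

Under this rule the total reward equals the number of unique states visited beyond the start, which matches the stated goal of visiting as many unique states as possible within a step budget.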

A Unified Transferable Model for ML-Enhanced DBMS

Ziniu Wu, Peilun Yang, Pei Yu (2021)

Recently, the database management system (DBMS) community has witnessed the power of machine learning (ML) solutions for DBMS tasks. Despite their promising performance, these existing solutions can hardly be considered satisfactory. First, these ML-based methods in DBMS are not effective enough because they are optimized on each specific task, and cannot explore or understand the intrinsic connections between tasks. Second, the training process has serious limitations that hinder their practicality, because they need to retrain the entire model from scratch for a new DB. Moreover, for each retraining, they require an excessive amount of training data, which is very expensive to acquire and unavailable for a new DB. We propose to explore the transferability of the ML methods both across tasks and across DBs to tackle these fundamental drawbacks. In this paper, we propose a unified model MTMLF that uses a multi-task training procedure to capture the transferable knowledge across tasks and a pre-train fine-tune procedure to distill the transferable meta knowledge across DBs. We believe this paradigm is more suitable for cloud DB service, and has the potential to revolutionize how ML is used in DBMS. Furthermore, to demonstrate the predicting power and viability of MTMLF, we provide a concrete and very promising case study on query optimization tasks. Last but not least, we discuss several concrete research opportunities along this line of work.

Topics: Databases; Artificial intelligence

TGG: Transferable Graph Generation for Zero-shot and Few-shot Learning

Chenrui Zhang, Xiaoqing Lyu, Zhi Tang (2019)

Zero-shot and few-shot learning aim to improve generalization to unseen concepts, which is promising in many realistic scenarios. Due to the lack of data in the unseen domain, relation modeling between seen and unseen domains is vital for knowledge transfer in these tasks. Most existing methods capture the seen-unseen relation implicitly via semantic embedding or feature generation, resulting in inadequate use of the relation, and some issues remain (e.g. domain shift). To tackle these challenges, we propose a Transferable Graph Generation (TGG) approach, in which the relation is modeled and utilized explicitly via graph generation. Specifically, our proposed TGG contains two main components: (1) Graph generation for relation modeling. An attention-based aggregate network and a relation kernel are proposed, which generate an instance-level graph based on a class-level prototype graph and visual features. Proximity information aggregation is guided by a multi-head graph attention mechanism, where seen and unseen features synthesized by GAN are revised as node embeddings. The relation kernel further generates edges with GCN and a graph kernel method, to capture instance-level topological structure while tackling data imbalance and noise. (2) Relation propagation for relation utilization. A dual relation propagation approach is proposed, where relations captured by the generated graph are separately propagated from the seen and unseen subgraphs. The two propagations learn from each other in a dual learning fashion, which acts as an adaptation mechanism for mitigating domain shift. All components are jointly optimized with a meta-learning strategy, and our TGG acts as an end-to-end framework unifying conventional zero-shot, generalized zero-shot and few-shot learning. Extensive experiments demonstrate that it consistently surpasses existing methods of the above three fields by a significant margin.

Topics: Machine learning

Learned Optimizers for Analytic Continuation

Dongchen Huang, Yi-feng Yang (2021)

Traditional maximum entropy and sparsity-based algorithms for analytic continuation often suffer from the ill-posed kernel matrix or demand tremendous computation time for parameter tuning. Here we propose a neural network method by convex optimization and replace the ill-posed inverse problem by a sequence of well-conditioned surrogate problems. After training, the learned optimizers are able to give a solution of high quality with low time cost and achieve higher parameter efficiency than heuristic fully connected networks. The output can also be used as a neural default model to improve the maximum entropy for better performance. Our methods may be easily extended to other high-dimensional inverse problems via large-scale pretraining.

Topics: Machine learning; Statistical mechanics; Strongly correlated electrons

Training Learned Optimizers with Randomly Initialized Learned Optimizers

Luke Metz, C. Daniel Freeman, Niru Maheswaranathan (2021)

Learned optimizers are increasingly effective, with performance exceeding that of hand-designed optimizers such as Adam (Kingma & Ba, 2014) on specific tasks (Metz et al., 2019). Despite the potential gains available, in current work the meta-training (or "outer-training") of the learned optimizer is performed by a hand-designed optimizer, or by an optimizer trained by a hand-designed optimizer (Metz et al., 2020). We show that a population of randomly initialized learned optimizers can be used to train themselves from scratch in an online fashion, without resorting to a hand-designed optimizer in any part of the process. A form of population-based training is used to orchestrate this self-training. Although the randomly initialized optimizers initially make slow progress, as they improve they experience a positive feedback loop, and become rapidly more effective at training themselves. We believe feedback loops of this type, where an optimizer improves itself, will be important and powerful in the future of machine learning. These methods not only provide a path towards increased performance, but more importantly relieve research and engineering effort.

Topics: Machine learning; Neural and evolutionary computing