ترغب بنشر مسار تعليمي؟ اضغط هنا

An important linear algebra routine, GEneral Matrix Multiplication (GEMM), is a fundamental operator in deep learning. Compilers need to translate these routines into low-level code optimized for specific hardware. Compiler-level optimization of GEMM has significant performance impact on training and executing deep learning models. However, most deep learning frameworks rely on hardware-specific operator libraries in which GEMM optimization has been mostly achieved by manual tuning, which restricts the performance on different target hardware. In this paper, we propose two novel algorithms for GEMM optimization based on the TVM framework, a lightweight Greedy Best First Search (G-BFS) method based on heuristic search, and a Neighborhood Actor Advantage Critic (N-A2C) method based on reinforcement learning. Experimental results show significant performance improvement of the proposed methods, in both the optimality of the solution and the cost of search in terms of time and fraction of the search space explored. Specifically, the proposed methods achieve 24% and 40% savings in GEMM computation time over state-of-the-art XGBoost and RNN methods, respectively, while exploring only 0.1% of the search space. The proposed approaches have potential to be applied to other operator-level optimizations.
Fog computing is a promising architecture to provide economic and low latency data services for future Internet of things (IoT)-based network systems. It relies on a set of low-power fog nodes that are close to the end users to offload the services o riginally targeting at cloud data centers. In this paper, we consider a specific fog computing network consisting of a set of data service operators (DSOs) each of which controls a set of fog nodes to provide the required data service to a set of data service subscribers (DSSs). How to allocate the limited computing resources of fog nodes (FNs) to all the DSSs to achieve an optimal and stable performance is an important problem. In this paper, we propose a joint optimization framework for all FNs, DSOs and DSSs to achieve the optimal resource allocation schemes in a distributed fashion. In the framework, we first formulate a Stackelberg game to analyze the pricing problem for the DSOs as well as the resource allocation problem for the DSSs. Under the scenarios that the DSOs can know the expected amount of resource purchased by the DSSs, a many-to-many matching game is applied to investigate the pairing problem between DSOs and FNs. Finally, within the same DSO, we apply another layer of many-to-many matching between each of the paired FNs and serving DSSs to solve the FN-DSS pairing problem. Simulation results show that our proposed framework can significantly improve the performance of the IoT-based network systems.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا