
Large-scale graph optimization problems arise in many fields. This paper presents an extensible, high-performance framework (named OpenGraphGym-MG) that uses deep reinforcement learning and graph embedding to solve large graph optimization problems with multiple GPUs. The paper uses a common RL algorithm (deep Q-learning) and a representative graph embedding (structure2vec) to demonstrate the extensibility of the framework and, most importantly, to illustrate the novel optimization techniques, such as spatial parallelism, graph-level and node-level batched processing, distributed sparse graph storage, efficient parallel RL training and inference algorithms, repeated gradient descent iterations, and adaptive multiple-node selections. This study performs a comprehensive performance analysis of parallel efficiency and memory cost, which shows that the parallel RL training and inference algorithms are efficient and highly scalable on a number of GPUs. This study also conducts a range of large graph experiments, with both generated graphs (over 30 million edges) and real-world graphs, using a single compute node (with six GPUs) of the Summit supercomputer. Good scalability in both RL training and inference is achieved: as the number of GPUs increases from one to six, the time of a single step of RL training and a single step of RL inference on large graphs with more than 30 million edges is reduced from 316.4s to 54.5s, and from 23.8s to 3.4s, respectively. The research results on a single node lay a solid foundation for future work to address graph optimization problems with a large number of GPUs across multiple nodes of Summit.
Yuankun Fu, Fengguang Song, 2020
This chapter introduces the state of the art in the emerging area of combining High Performance Computing (HPC) with Big Data Analysis. To understand the new area, the chapter first surveys the existing approaches to integrating HPC with Big Data. Next, the chapter introduces several optimization solutions that focus on minimizing both the data transfer time from computation-intensive applications to analysis-intensive applications and the end-to-end time-to-solution. The solutions utilize SDN to adaptively use both high-speed interconnect networks and high-performance parallel file systems to optimize application performance. A computational framework called DataBroker is designed and developed to enable a tight integration of HPC with data analysis. Multiple types of experiments have been conducted to show different performance issues in both message passing and parallel file systems and to verify the effectiveness of the proposed research approaches.
The amount of large-scale scientific computing software is dramatically increasing. In this work, we designed a new language, named feature query language (FQL), to collect and extract software features from a quick static code analysis. We designed and implemented an FQL toolkit to automatically detect and present the software features using an extensible query repository. Several large-scale, high performance computing (HPC) scientific codes have been used in the paper to demonstrate the HPC-related feature extraction and information collection. Although we emphasized the HPC features in the study, the toolkit can be easily extended to answer general software feature questions, such as coding pattern and hardware dependency.
Hyuntae Na, Guang Song, 2015
It is shown that the density of modes of the vibrational spectrum of globular proteins is universal, i.e., regardless of the protein in question it closely follows one universal curve. The present study, including 135 proteins analyzed with a full atomic empirical potential (CHARMM22) and using the full complement of all-atom Cartesian degrees of freedom, goes far beyond previous claims of universality, confirming that universality holds even in the high-frequency range (300-4000 1/cm), where peaks and turns in the density of states are faithfully reproduced from one protein to the next. We also characterize fluctuations of the spectral density from the average, paving the way to a meaningful discussion of rare, unusual spectra and the structural reasons for the deviations in such outlier proteins. Since the method used for the derivation of the vibrational modes (potential energy formulation, set of degrees of freedom employed, etc.) has a dramatic effect on the spectral density, another significant implication of our findings is that the universality can provide an exquisite tool for assessing and improving the quality of various models used for NMA computations. Finally, we show that the input configuration also affects the density of modes, thus emphasizing the importance of simplified potential energy formulations that are minimized at the outset.