Scalable Parallel Numerical Methods and Software Tools for Material Design

122 0 0.0 ( 0 )

Download Cite

Added by Ryoichi Kawai

Publication date 1994

fields Physics

and research's language is English

Authors E. Bylaska

Materials Science

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

A new method of solution to the local spin density approximation to the electronic Schr{o}dinger equation is presented. The method is based on an efficient, parallel, adaptive multigrid eigenvalue solver. It is shown that adaptivity is both necessary and sufficient to accurately solve the eigenvalue problem near the singularities at the atomic centers. While preliminary, these results suggest that direct real space methods may provide a much needed method for efficiently computing the forces in complex materials.

rate research

Parallel and Scalable Heat Methods for Geodesic Distance Computation

111 - Jiong Tao , Juyong Zhang , Bailin Deng 2018

In this paper, we propose a parallel and scalable approach for geodesic distance computation on triangle meshes. Our key observation is that the recovery of geodesic distance with the heat method from [Crane et al. 2013] can be reformulated as optimization of its gradients subject to integrability, which can be solved using an efficient first-order method that requires no linear system solving and converges quickly. Afterward, the geodesic distance is efficiently recovered by parallel integration of the optimized gradients in breadth-first order. Moreover, we employ a similar breadth-first strategy to derive a parallel Gauss-Seidel solver for the diffusion step in the heat method. To further lower the memory consumption from gradient optimization on faces, we also propose a formulation that optimizes the projected gradients on edges, which reduces the memory footprint by about 50%. Our approach is trivially parallelizable, with a low memory footprint that grows linearly with respect to the model size. This makes it particularly suitable for handling large models. Experimental results show that it can efficiently compute geodesic distance on meshes with more than 200 million vertices on a desktop PC with 128GB RAM, outperforming the original heat method and other state-of-the-art geodesic distance solvers.

Graphics

Reliable and Explainable Machine Learning Methods for Accelerated Material Discovery

375 - Bhavya Kailkhura , Brian Gallagher , Sookyung Kim 2019

Material scientists are increasingly adopting the use of machine learning (ML) for making potentially important decisions, such as, discovery, development, optimization, synthesis and characterization of materials. However, despite MLs impressive performance in commercial applications, several unique challenges exist when applying ML in materials science applications. In such a context, the contributions of this work are twofold. First, we identify common pitfalls of existing ML techniques when learning from underrepresented/imbalanced material data. Specifically, we show that with imbalanced data, standard methods for assessing quality of ML models break down and lead to misleading conclusions. Furthermore, we found that the models own confidence score cannot be trusted and model introspection methods (using simpler models) do not help as they result in loss of predictive performance (reliability-explainability trade-off). Second, to overcome these challenges, we propose a general-purpose explainable and reliable machine-learning framework. Specifically, we propose a novel pipeline that employs an ensemble of simpler models to reliably predict material properties. We also propose a transfer learning technique and show that the performance loss due to models simplicity can be overcome by exploiting correlations among different material properties. A new evaluation metric and a trust score to better quantify the confidence in the predictions are also proposed. To improve the interpretability, we add a rationale generator component to our framework which provides both model-level and decision-level explanations. Finally, we demonstrate the versatility of our technique on two applications: 1) predicting properties of crystalline compounds, and 2) identifying novel potentially stable solar cell materials.

Computational Physics Materials Science Machine Learning

Flexible and scalable particle-in-cell methods for massively parallel computations

132 - R. Gassmoeller , E. Heien , E. G. Puckett 2016

Particle-in-cell methods couple mesh-based methods for the solution of continuum mechanics problems, with the ability to advect and evolve particles. They have a long history and many applications in scientific computing. However, they have most often only been implemented for either sequential codes, or parallel codes with static meshes that are statically partitioned. In contrast, many mesh-based codes today use adaptively changing, dynamically partitioned meshes, and can scale to thousands or tens of thousands of processors. Consequently, there is a need to revisit the data structures and algorithms necessary to use particle methods with modern, mesh-based methods. Here we review commonly encountered requirements of particle-in-cell methods, and describe efficient ways to implement them in the context of large-scale parallel finite-element codes that use dynamically changing meshes. We also provide practical experience for how to address bottlenecks that impede the efficient implementation of these algorithms and demonstrate with numerical tests both that our algorithms can be implemented with optimal complexity and that they are suitable for very large-scale, practical applications. We provide a reference implementation in ASPECT, an open source code for geodynamic mantle-convection simulations built on the deal.II library.

Numerical Analysis Computational Physics

A Conditional Generative Model for Predicting Material Microstructures from Processing Methods

91 - Akshay Iyer , Biswadip Dey , Arindam Dasgupta 2019

Microstructures of a material form the bridge linking processing conditions - which can be controlled, to the material property - which is the primary interest in engineering applications. Thus a critical task in material design is establishing the processing-structure relationship, which requires domain expertise and techniques that can model the high-dimensional material microstructure. This work proposes a deep learning based approach that models the processing-structure relationship as a conditional image synthesis problem. In particular, we develop an auxiliary classifier Wasserstein GAN with gradient penalty (ACWGAN-GP) to synthesize microstructures under a given processing condition. This approach is free of feature engineering, requires modest domain knowledge and is applicable to a wide range of material systems. We demonstrate this approach using the ultra high carbon steel (UHCS) database, where each microstructure is annotated with a label describing the cooling method it was subjected to. Our results show that ACWGAN-GP can synthesize high-quality multiphase microstructures for a given cooling method.

Image and Video Processing Materials Science Machine Learning

Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

241 - Dheevatsa Mudigere , Yuchen Hao , Jianyu Huang 2021

Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pair it with the new evolution of Zion platform, namely ZionEX. We demonstrate the capability to train very large DLRMs with up to 12 Trillion parameters and show that we can attain 40X speedup in terms of time to solution over previous systems. We achieve this by (i) designing the ZionEX platform with dedicated scale-out network, provisioned with high bandwidth, optimal topology and efficient transport (ii) implementing an optimized PyTorch-based training stack supporting both model and data parallelism (iii) developing sharding algorithms capable of hierarchical partitioning of the embedding tables along row, column dimensions and load balancing them across multiple workers; (iv) adding high-performance core operators while retaining flexibility to support optimizers with fully deterministic updates (v) leveraging reduced precision communications, multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we develop and briefly comment on distributed data ingestion and other supporting services that are required for the robust and efficient end-to-end training in production environments.

Distributed Parallel and Cluster Computing Artificial Intelligence Machine Learning