We outline the development of a general-purpose Python-based data analysis tool for OpenFOAM. Our implementation relies on the construction of OpenFOAM applications that have bindings to data analysis libraries in Python. Double precision data in OpenFOAM is cast to a NumPy array using the NumPy C-API, and Python modules may then be used for arbitrary data analysis and manipulation on flow-field information. We highlight how the proposed wrapper may be used for an in-situ online singular value decomposition (SVD) implemented in Python and accessed from the OpenFOAM solver PimpleFOAM. Here, "in-situ" refers to a programming paradigm that allows for a concurrent computation of the data analysis on the same computational resources utilized for the partial differential equation solver. In addition, to demonstrate data-parallel analyses, we deploy a distributed SVD, which collects snapshot data across the ranks of a distributed simulation to compute the global left singular vectors. Crucially, both OpenFOAM and Python share the same message passing interface (MPI) communicator for this deployment, which allows Python objects and functions to exchange NumPy arrays across ranks. Subsequently, we provide scaling assessments of this distributed SVD on multiple nodes of Intel Broadwell and KNL architectures for canonical test cases such as the large eddy simulations of a backward facing step and a channel flow at a friction Reynolds number of 395. Finally, we deploy an autoencoder, a deep neural network that compresses the flow-field information, to demonstrate the ability to use state-of-the-art machine learning tools in the Python ecosystem.
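To make the distributed SVD idea concrete, the following is a minimal NumPy sketch, not the implementation described in the abstract. It assumes the classical method of snapshots: each rank holds a block of rows of the snapshot matrix, forms a small Gram matrix, and the sum of those matrices (standing in here for an MPI allreduce) yields the singular values and each rank's slice of the global left singular vectors. The function name, the list-of-blocks stand-in for MPI ranks, and the random test data are all hypothetical.

```python
import numpy as np

def distributed_left_singular_vectors(local_blocks, num_modes):
    """Method-of-snapshots SVD over row-partitioned snapshot data.

    local_blocks: list of (n_i x m) arrays, one per hypothetical rank,
                  where m is the number of snapshots.
    Returns each rank's slice of the left singular vectors and the
    leading singular values.
    """
    # Each rank forms its local m x m Gram matrix. The Python sum stands in
    # for an MPI allreduce over the shared communicator (an assumption).
    gram = sum(block.T @ block for block in local_blocks)

    # Eigen-decomposition of the symmetric Gram matrix; eigh returns
    # eigenvalues in ascending order, so reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:num_modes]
    sigma = np.sqrt(np.clip(eigvals[order], 0.0, None))
    right_vectors = eigvecs[:, order]

    # Each rank recovers its own rows of U = A V Sigma^{-1} locally.
    local_u = [block @ right_vectors / sigma for block in local_blocks]
    return local_u, sigma

# Hypothetical data: 120 grid points, 8 snapshots, split across 4 "ranks".
rng = np.random.default_rng(0)
snapshots = rng.standard_normal((120, 8))
blocks = np.array_split(snapshots, 4, axis=0)

local_u, sigma = distributed_left_singular_vectors(blocks, num_modes=3)
u_global = np.vstack(local_u)
```

Only the small Gram matrix would need to cross rank boundaries in this scheme, which is why it suits tall snapshot matrices. Forming the Gram matrix squares the condition number, so a production implementation may prefer a different algorithm.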
We outline the development of a data science module within OpenFOAM which allows for the in-situ deployment of trained deep learning architectures for general-purpose predictive tasks. This module is constructed with the TensorFlow C API and is integrated into OpenFOAM as an application that may be linked at run time. Notably, our formulation imposes no restrictions on the type of neural network architecture (e.g., convolutional, fully-connected, etc.). This allows for potential studies of complicated neural architectures for practical CFD problems. In addition, the proposed module outlines a path towards an open-source, unified and transparent framework for computational fluid dynamics and machine learning.
Python has become the de facto language for scientific computing. Programming in Python is highly productive, mainly due to its rich science-oriented software ecosystem built around the NumPy module. As a result, the demand for Python support in High Performance Computing (HPC) has skyrocketed. However, the Python language itself does not necessarily offer high performance. In this work, we present a workflow that retains Python's high productivity while achieving portable performance across different architectures. The workflow's key features are HPC-oriented language extensions and a set of automatic optimizations powered by a data-centric intermediate representation. We show performance results and scaling across CPU, GPU, FPGA, and the Piz Daint supercomputer (up to 23,328 cores), with 2.47x and 3.75x speedups over previous-best solutions, first-ever Xilinx and Intel FPGA results of annotated Python, and up to 93.16% scaling efficiency on 512 nodes.
There are numerous approaches to building analysis applications across the high-energy physics community. Among them are Python-based, or at least Python-driven, analysis workflows. We aim to ease the adoption of a Python-based analysis toolkit by making it easier for non-expert users to gain access to Python tools for scientific analysis. Experimental software distributions and individual user analysis have quite different requirements. Distributions tend to worry most about stability, usability and reproducibility, while users usually strive to be fast and nimble. We discuss how we built and now maintain a Python distribution for analysis while satisfying the requirements of both a large software distribution (in our case, that of CMSSW) and user-level, or laptop-level, analysis. We pursued the integration of tools used by the broader data science community as well as HEP-developed Python packages (e.g., histogrammar, root_numpy). We discuss concepts we investigated for package integration and testing, as well as issues we encountered through this process. Distribution and platform support are important topics. We discuss our approach and progress towards a sustainable infrastructure for supporting this Python stack for the CMS user community and for the broader HEP user community.
Soft particles at fluid interfaces play an important role in many aspects of our daily life, such as the food industry, paints and coatings, and medical applications. Analytical methods are not capable of describing the emergent effects of the complex dynamics of suspensions of many soft particles, whereas experiments typically either only capture bulk properties or require invasive methods. Computational methods are therefore a great tool to complement experimental work. However, an efficient and versatile numerical method is needed to model dense suspensions of many soft particles. In this article we propose a method to simulate soft particles in a multi-component fluid, both at and near fluid-fluid interfaces, based on the lattice Boltzmann method, and characterize the error stemming from the fluid-structure coupling for the particle equilibrium shape when adsorbed onto a fluid-fluid interface. Furthermore, we characterize the influence of the preferential contact angle of the particle surface and the particle softness on the vertical displacement of the center of mass relative to the fluid interface. Finally, we demonstrate the capability of our model by simulating a soft capsule adsorbing onto a fluid-fluid interface with a shear flow parallel to the interface, and the covering of a droplet suspended in another fluid by soft particles with different wettability.
Visualizing regional-scale landslides is the key to conveying the threat of natural hazards to stakeholders and policymakers. Traditional visualization techniques are restricted to post-processing a limited subset of simulation data and are not scalable to rendering exascale models with billions of particles. In-situ visualization is a technique of rendering simulation data in real-time, i.e., rendering visuals in tandem while the simulation is running. In this study, we develop a scalable N:M interface architecture to visualize regional-scale landslides. We demonstrate the scalability of the architecture by simulating the long runout of the 2014 Oso landslide using the Material Point Method coupled with the Galaxy ray tracing engine rendering 4.2 million material points as spheres. In-situ visualization has an amortized runtime increase of 2% compared to non-visualized simulations. The developed approach can achieve in-situ visualization of regional-scale landslides with billions of particles with minimal impact on the simulation process.