No Arabic abstract
We have developed a new parallel tree method which will be called the forest method hereafter. This new method uses the sectional Voronoi tessellation (SVT) for the domain decomposition. The SVT decomposes a whole space into polyhedra and allows their flat borders to move by assigning different weights. The forest method determines these weights based on the load balancing among processors by means of the over-load diffusion (OLD). Moreover, since all the borders are flat, before receiving the data from other processors, each processor can collect enough data to calculate the gravity force with precision. Both the SVT and the OLD are coded in a highly vectorizable manner to accommodate on vector parallel processors. The parallel code based on the forest method with the Message Passing Interface is run on various platforms so that a wide portability is guaranteed. Extensive calculations with 15 processors of Fujitsu VPP300/16R indicate that the code can calculate the gravity force exerted on 10^5 particles in each second for some ideal dark halo. This code is found to enable an N-body simulation with 10^7 or more particles for a wide dynamic range and is therefore a very powerful tool for the study of galaxy formation and large-scale structure in the universe.
We present a new open source code for massive parallel computation of Voronoi tessellations(VT hereafter) in large data sets. The code is focused for astrophysical purposes where VT densities and neighbors are widely used. There are several serial Voronoi tessellation codes, however no open source and parallel implementations are available to handle the large number of particles/galaxies in current N-body simulations and sky surveys. Parallelization is implemented under MPI and VT using Qhull library. Domain decomposition takes into account consistent boundary computation between tasks, and includes periodic conditions. In addition, the code computes neighbors list, Voronoi density, Voronoi cell volume, density gradient for each particle, and densities on a regular grid.
The typical cell of a Voronoi tessellation generated by $n+1$ uniformly distributed random points on the $d$-dimensional unit sphere $mathbb S^d$ is studied. Its $f$-vector is identified in distribution with the $f$-vector of a beta polytope generated by $n$ random points in $mathbb R^d$. Explicit formulae for the expected $f$-vector are provided for any $d$ and the low-dimensional cases $din{2,3,4}$ are studied separately. This implies an explicit formula for the total number of $k$-dimensional faces in the spherical Voronoi tessellation as well.
In this work, we present an adaptive unfitted finite element scheme that combines the aggregated finite element method with parallel adaptive mesh refinement. We introduce a novel scalable distributed-memory implementation of the resulting scheme on locally-adapted Cartesian forest-of-trees meshes. We propose a two-step algorithm to construct the finite element space at hand by means of a discrete extension operator that carefully mixes aggregation constraints of problematic degrees of freedom, which get rid of the small cut cell problem, and standard hanging degree of freedom constraints, which ensure trace continuity on non-conforming meshes. Following this approach, we derive a finite element space that can be expressed as the original one plus well-defined linear constraints. Moreover, it requires minimum parallelization effort, using standard functionality available in existing large-scale finite element codes. Numerical experiments demonstrate its optimal mesh adaptation capability, robustness to cut location and parallel efficiency, on classical Poisson $hp$-adaptivity benchmarks. Our work opens the path to functional and geometrical error-driven dynamic mesh adaptation with the aggregated finite element method in large-scale realistic scenarios. Likewise, it can offer guidance for bridging other scalable unfitted methods and parallel adaptive mesh refinement.
We have developed a highly-tuned software library that accelerates the calculation of quadrupole terms in the Barnes-Hut tree code by use of a SIMD instruction set on the x86 architecture, Advanced Vector eXtensions 2 (AVX2). Our code is implemented as an extension of Phantom-GRAPE software library (Tanikawa et al. 2012, 2013) that significantly accelerates the calculation of monopole terms. If the same accuracy is required, the calculation of quadrupole terms can accelerate the evaluation of forces than that of only monopole terms because we can approximate gravitational forces from closer particles by quadrupole moments than by only monopole moments. Our implementation can calculate gravitational forces about 1.1 times faster in any system than the combination of the pseudoparticle multipole method and Phantom-GRAPE. Our implementation allows simulating homogeneous systems up to 2.2 times faster than that with only monopole terms, however, speed up for clustered systems is not enough because the increase of approximated interactions is insufficient to negate the increased calculation cost by computing quadrupole terms. We have estimated that improvement in performance can be achieved by the use of a new SIMD instruction set, AVX-512. Our code is expected to be able to accelerate simulations of clustered systems up to 1.08 times faster on AVX-512 environment than that with only monopole terms.
A homogeneous Poisson-Voronoi tessellation of intensity $gamma$ is observed in a convex body $W$. We associate to each cell of the tessellation two characteristic radii: the inradius, i.e. the radius of the largest ball centered at the nucleus and included in the cell, and the circumscribed radius, i.e. the radius of the smallest ball centered at the nucleus and containing the cell. We investigate the maximum and minimum of these two radii over all cells with nucleus in $W$. We prove that when $gammarightarrowinfty$, these four quantities converge to Gumbel or Weibull distributions up to a rescaling. Moreover, the contribution of boundary cells is shown to be negligible. Such approach is motivated by the analysis of the global regularity of the tessellation. In particular, consequences of our study include the convergence to the simplex shape of the cell with smallest circumscribed radius and an upper-bound for the Hausdorff distance between $W$ and its so-called Poisson-Voronoi approximation.