We present the Multiscale Coupling Library and Environment: MUSCLE 2. This multiscale component-based execution environment has a simple-to-use Java, C++, C, Python, and Fortran API, compatible with MPI, OpenMP, and threading codes. We demonstrate its local and distributed computing capabilities and compare its performance to MUSCLE 1, file copy, MPI, MPWide, and GridFTP. The local throughput of MPI is about two times higher, so very tightly coupled code should use MPI as a single submodel of MUSCLE 2; the distributed performance of GridFTP is lower, especially for small messages. We test the performance of a canal system model with MUSCLE 2, where it introduces an overhead as small as 5% compared to MPI.
Master-worker distributed computing systems use task replication to mitigate the effect of slow workers, known as stragglers. Tasks are grouped into batches and assigned to one or more workers for execution. We first consider the case when the batches do not overlap and, using results from majorization theory, show that, for a general class of worker service-time distributions, a balanced assignment of batches to workers minimizes the average job compute time. We next show that this balanced assignment of non-overlapping batches achieves a lower average job compute time than the overlapping schemes proposed in the literature. Furthermore, we derive the optimum redundancy level as a function of the service-time distribution at the workers. We show that the redundancy level that minimizes the average job compute time is not necessarily the same as the redundancy level that maximizes the predictability of job compute time, and thus there exists a trade-off between optimizing the two metrics. Finally, by running experiments on Google cluster traces, we observe that redundancy can reduce the compute time of the jobs in Google clusters by an order of magnitude, and that the optimum level of redundancy depends on the distribution of task service times.
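The balanced-assignment claim can be illustrated with a small Monte Carlo sketch. It is not the authors' method or experiment: it assumes i.i.d. exponential service times (only one member of the general class the abstract refers to), it assumes a batch finishes when its fastest replica finishes and a job finishes when its slowest batch finishes, and the names job_time and mean_job_time are hypothetical helpers introduced here for illustration.

```python
import random


def job_time(replicas_per_batch, rate, rng):
    """Simulate one job under the assumed model.

    Each batch is replicated on `k` workers; the batch is done when its
    fastest replica finishes (min), and the job is done when its slowest
    batch finishes (max). Service times are i.i.d. Exp(rate), which is an
    assumption made for this sketch only.
    """
    return max(
        min(rng.expovariate(rate) for _ in range(k))
        for k in replicas_per_batch
    )


def mean_job_time(replicas_per_batch, rate=1.0, trials=20000, seed=0):
    """Estimate the average job compute time by Monte Carlo simulation."""
    rng = random.Random(seed)
    total = sum(job_time(replicas_per_batch, rate, rng) for _ in range(trials))
    return total / trials


# 12 workers, 4 non-overlapping batches: balanced vs. unbalanced replication.
balanced = mean_job_time([3, 3, 3, 3])
unbalanced = mean_job_time([6, 2, 2, 2])

print(f"balanced:   {balanced:.3f}")
print(f"unbalanced: {unbalanced:.3f}")
```

Under these assumptions the balanced case has a closed form, since the minimum of three Exp(1) variables is Exp(3) and the expected maximum of four i.i.d. Exp(3) variables is (1/3)(1 + 1/2 + 1/3 + 1/4) = 25/36, so the simulated balanced mean should land near 0.69 and below the unbalanced mean.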
Physical processes influencing the properties of galaxies can be traced by the dependence and evolution of galaxy properties on their environment. A detailed understanding of this dependence can only be gained through comparison of observations with models, with an appropriate quantification of the rich parameter space describing the environment of the galaxy. We present a new, multiscale parameterization of galaxy environment which retains an observationally motivated simplicity whilst utilizing the information present on different scales. We examine how the distribution of galaxy (u-r) colours in the Sloan Digital Sky Survey (SDSS), parameterized using a double Gaussian (red plus blue peak) fit, depends upon multiscale density. This allows us to probe the detailed dependence of galaxy properties on environment in a way which is independent of the halo model. Nonetheless, cross-correlation with the group catalogue constructed by Yang et al. (2007) shows that galaxy properties trace environment on different scales in a way which mimics that expected within the halo model. This provides independent support for the existence of virialized haloes, and important additional clues to the role played by environment in the evolution of the galaxy population. This work is described in full by Wilman et al. (2010, MNRAS, accepted).
The nature of dark energy and the complete theory of gravity are two central questions currently facing cosmology. A vital tool for addressing them is the 3-point correlation function (3PCF), which probes deviations from a spatially random distribution of galaxies. However, the 3PCF's formidable computational expense has prevented its application to astronomical surveys comprising millions to billions of galaxies. We present Galactos, a high-performance implementation of a novel, O(N^2) algorithm that uses a load-balanced k-d tree and spherical harmonic expansions to compute the anisotropic 3PCF. Our implementation is optimized for the Intel Xeon Phi architecture, exploiting SIMD parallelism, instruction and thread concurrency, and significant L1 and L2 cache reuse, reaching 39% of peak performance on a single node. Galactos scales to the full Cori system, achieving 9.8PF (peak) and 5.06PF (sustained) across 9636 nodes, making the 3PCF easily computable for all galaxies in the observable universe.
Multiscale models allow for the treatment of complex phenomena involving different scales, such as remodeling and growth of tissues, muscular activation, and cardiac electrophysiology. Numerous numerical approaches have been developed to simulate multiscale problems. However, compared to the well-established methods for classical problems, many questions have yet to be answered. Here, we give an overview of existing models and methods, with particular emphasis on mechanical and bio-mechanical applications. Moreover, we discuss state-of-the-art techniques for multilevel and multifidelity uncertainty quantification. In particular, we focus on the similarities that can be found across multiscale models, discretizations, solvers, and statistical methods for uncertainty quantification. In line with the current trend in scientific computing of removing the segregation between discretizations and solution methods, we anticipate that the future of multiscale simulation will also bring closer interaction with the models and the statistical methods. This will yield better strategies for transferring information across different scales and a more seamless transition in selecting and adapting the level of detail in the models. Finally, we note that machine learning and Bayesian techniques have shown a promising capability to capture complex model dependencies and enrich the results with statistical information; therefore, they can complement traditional physics-based and numerical analysis approaches.
Clustering algorithms partition a dataset into groups of similar points. The clustering problem is very general, and different partitions of the same dataset could be considered correct and useful. To fully understand such data, it must be considered at a variety of scales, ranging from coarse to fine. We introduce the Multiscale Environment for Learning by Diffusion (MELD) data model, which is a family of clusterings parameterized by nonlinear diffusion on the dataset. We show that the MELD data model precisely captures latent multiscale structure in data and facilitates its analysis. To efficiently learn the multiscale structure observed in many real datasets, we introduce the Multiscale Learning by Unsupervised Nonlinear Diffusion (M-LUND) clustering algorithm, which is derived from a diffusion process at a range of temporal scales. We provide theoretical guarantees for the algorithm's performance and establish its computational efficiency. Finally, we show that the M-LUND clustering algorithm detects the latent structure in a range of synthetic and real datasets.