Two-dimensional (2D) materials have been a hot research topic in the last decade, due to novel fundamental physics in the reduced dimension and appealing applications. Systematic discovery of functional 2D materials has been the focus of many studies. Here, we present a large dataset of 2D materials, with more than 6,000 monolayer structures, obtained from both top-down and bottom-up discovery procedures. First, we screened all bulk materials in the Materials Project database for layered structures using a topology-based algorithm, and theoretically exfoliated them into monolayers. Then, we generated new 2D materials by substituting elements in known 2D materials with others from the same group of the periodic table. The structural, electronic and energetic properties of these 2D materials are calculated consistently, to provide a starting point for further material screening, data mining, data analysis and artificial intelligence applications. We present the details of the computational methodology, data records and technical validation of our publicly available data (http://www.2dmatpedia.org/).
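The bottom-up substitution step described above can be illustrated with a minimal sketch. It is not the authors' code: the `SAME_GROUP` table is a small, hypothetical excerpt of the periodic table, and the function only enumerates candidate compositions, leaving out the structure relaxation and property calculations that the abstract mentions.

```python
from itertools import product

# Hypothetical, partial lookup of same-group elements (illustration only).
SAME_GROUP = {
    "Mo": ["Cr", "Mo", "W"],
    "W": ["Cr", "Mo", "W"],
    "S": ["O", "S", "Se", "Te"],
    "Se": ["O", "S", "Se", "Te"],
}

def substitute(elements):
    """Yield element tuples obtained by swapping each element
    for one from the same periodic-table group."""
    # Elements missing from the table are kept unchanged.
    choices = [SAME_GROUP.get(el, [el]) for el in elements]
    for combo in product(*choices):
        # Skip the parent composition itself.
        if combo != tuple(elements):
            yield combo

# Candidates derived from a parent with metal Mo and chalcogen S.
candidates = list(substitute(("Mo", "S")))
```

Each candidate would then need to be relaxed and checked for stability before it could count as a new 2D material.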
We search for novel two-dimensional materials that can be easily exfoliated from their parent compounds. Starting from 108423 unique, experimentally known three-dimensional compounds, we identify a subset of 5619 that appear layered according to robust geometric and bonding criteria. High-throughput calculations using van-der-Waals density-functional theory, validated against experimental structural data and calculated random-phase-approximation binding energies, allow us to identify 1825 compounds that are either easily or potentially exfoliable, including all that are commonly exfoliated experimentally. In particular, the subset of 1036 easily exfoliable cases (layered materials held together mostly by dispersion interactions and with binding energies up to $30$-$35$ meV$\cdot\text{\AA}^{-2}$) provides a wealth of novel structural prototypes and simple ternary compounds, and a large portfolio in which to search for materials with optimal properties. For the 258 compounds with up to 6 atoms per primitive cell we comprehensively explore vibrational, electronic, magnetic, and topological properties, identifying in particular 56 ferromagnetic and antiferromagnetic systems, including half-metals and half-semiconductors.
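The binding-energy criterion above can be sketched as follows. This is a minimal sketch under stated assumptions: the binding energy is taken as the energy difference between an isolated monolayer and one layer in the bulk, divided by the in-plane area; the function names are hypothetical; and the default threshold of 35 is the upper end of the quoted 30-35 meV per square angstrom range. The abstract does not state a cutoff for the "potentially exfoliable" class, so none is included.

```python
def binding_energy_mev_per_A2(e_monolayer_eV, e_bulk_per_layer_eV, area_A2):
    """Binding energy per unit area in meV/A^2.

    Assumes both energies refer to the same number of atoms
    (one layer) and that area_A2 is the in-plane cell area.
    """
    return 1000.0 * (e_monolayer_eV - e_bulk_per_layer_eV) / area_A2

def is_easily_exfoliable(eb_mev_per_A2, threshold=35.0):
    """True if the binding energy is at or below the chosen threshold.

    The threshold is an assumption taken from the upper end of
    the range quoted in the abstract.
    """
    return eb_mev_per_A2 <= threshold

# Made-up energies for illustration, not data from the study.
eb = binding_energy_mev_per_A2(-10.0, -10.25, 10.0)
easy = is_easily_exfoliable(eb)
```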
Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy, which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging.
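One way to combine local connections with global network information, as described above, is to have a node adopt the label that is most over-represented among its neighbours relative to that label's overall frequency. The sketch below shows a single such update. It is a simplified illustration and not the SpeakEasy implementation; the function name, the scoring rule and the toy network are all assumptions.

```python
from collections import Counter

def pick_label(node, neighbors, labels):
    """Return the label most over-represented among a node's neighbours.

    score(label) = (count among neighbours)
                   - degree * (global frequency of the label)
    """
    n_total = len(labels)
    global_count = Counter(labels.values())      # top-down information
    degree = len(neighbors[node])
    local_count = Counter(labels[v] for v in neighbors[node])  # bottom-up
    scores = {
        lab: cnt - degree * global_count[lab] / n_total
        for lab, cnt in local_count.items()
    }
    # Sorting first makes tie-breaking deterministic.
    return max(sorted(scores), key=scores.get)

# Toy network: label "b" is common globally, label "a" is rare.
labels = {i: "b" for i in range(10)}
labels[1] = "a"
neighbors = {0: [1, 2]}  # node 0 sees one "a" and one "b"
chosen = pick_label(0, neighbors, labels)
```

A plain majority vote would tie here. The global correction favours the rare label "a", because seeing it among two neighbours is more surprising than seeing the common label "b".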
The C2DB is a highly curated open database organizing a wealth of computed properties for more than 4000 atomically thin two-dimensional (2D) materials. Here we report on new materials and properties that were added to the database since its first release in 2018. The set of new materials comprises several hundred monolayers exfoliated from experimentally known layered bulk materials, (homo)bilayers in various stacking configurations, native point defects in semiconducting monolayers, and chalcogen/halogen Janus monolayers. The new properties include exfoliation energies, Bader charges, spontaneous polarisations, Born charges, infrared polarisabilities, piezoelectric tensors, band topology invariants, exchange couplings, and Raman and second-harmonic generation spectra. We also describe refinements of the employed material classification schemes, upgrades of the computational methodologies used for property evaluations, as well as significant enhancements of the data documentation and provenance. Finally, we explore the performance of Gaussian process-based regression for efficient prediction of mechanical and electronic materials properties. The combination of open access, detailed documentation, and extremely rich materials property data sets makes the C2DB a unique resource that will advance the science of atomically thin materials.
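The Gaussian process regression mentioned above can be illustrated with a minimal NumPy sketch. It assumes a one-dimensional descriptor, a squared-exponential kernel and a toy target function. None of these choices come from the C2DB work, which does not specify its descriptors or kernel in this abstract.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(x_train, y_train, x_test, length_scale=1.0, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train, length_scale)
    K_ss = rbf_kernel(x_test, x_test, length_scale)
    mean = K_s @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    # Clip tiny negative variances caused by round-off.
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std

# Toy data: the "property" is sin(x) of a made-up descriptor x.
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.sin(x_train)
mean, std = gp_predict(x_train, y_train, np.array([1.0, 1.5]))
```

The predicted standard deviation is small at a training point and larger between training points. This uncertainty estimate is what distinguishes a Gaussian process from a plain regression fit.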
We consider the task of learning a classifier for semantic segmentation using weak supervision in the form of image labels which specify the object classes present in the image. Our method uses deep convolutional neural networks (CNNs) and adopts an Expectation-Maximization (EM) based approach. We focus on the following three aspects of EM: (i) initialization; (ii) latent posterior estimation (E-step); and (iii) the parameter update (M-step). We show that saliency and attention maps of simple images, our bottom-up and top-down cues respectively, provide very good cues for learning an initialization for the EM-based algorithm. Intuitively, we show that before trying to learn to segment complex images, it is much easier and highly effective to first learn to segment a set of simple images and then move towards the complex ones. Next, in order to update the parameters, we propose minimizing the combination of the standard softmax loss and the KL divergence between the true latent posterior and the likelihood given by the CNN. We argue that this combination is more robust to wrong predictions made by the expectation step of the EM method. We support this argument with empirical and visual results. Extensive experiments and discussions show that our method (i) is very simple and intuitive; (ii) requires only image-level labels; and (iii) consistently outperforms other weakly-supervised state-of-the-art methods by a wide margin on the PASCAL VOC 2012 dataset.
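The M-step objective described above can be sketched for a single pixel as follows. This is a sketch under stated assumptions rather than the authors' formulation: the hard label is taken as the argmax of the latent posterior, the KL term is computed as KL(posterior || CNN softmax), and the `weight` parameter balancing the two terms is hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combined_loss(logits, posterior, weight=1.0, eps=1e-12):
    """Softmax cross-entropy on the hard label plus weighted KL term."""
    p = softmax(logits)
    # Hard label taken from the E-step posterior (assumption).
    hard_label = max(range(len(posterior)), key=posterior.__getitem__)
    cross_entropy = -math.log(p[hard_label] + eps)
    # KL(posterior || p); classes with zero posterior contribute nothing.
    kl = sum(
        q * math.log((q + eps) / (pi + eps))
        for q, pi in zip(posterior, p)
        if q > 0
    )
    return cross_entropy + weight * kl

# A confident, correct prediction costs less than a confident, wrong one.
good = combined_loss([5.0, 0.0], [1.0, 0.0])
bad = combined_loss([0.0, 5.0], [1.0, 0.0])
```

When the posterior is spread over several classes, the KL term rewards predictions that match the whole distribution. This is one reading of why the combination would be more forgiving of a wrong hard label from the E-step.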
Materials Cloud is a platform designed to enable open and seamless sharing of resources for computational science, driven by applications in materials modelling. It hosts 1) archival and dissemination services for raw and curated data, together with their provenance graph, 2) modelling services and virtual machines, 3) tools for data analytics and pre-/post-processing, and 4) educational materials. Data is citable and archived persistently, providing a comprehensive embodiment of the FAIR principles that extends to computational workflows. Materials Cloud leverages the AiiDA framework to record the provenance of entire simulation pipelines (calculations performed, codes used, data generated) in the form of graphs that allow any computed result to be retraced and reproduced. When an AiiDA database is shared on Materials Cloud, peers can browse the interconnected record of simulations, download individual files or the full database, and start their research from the results of the original authors. The infrastructure is agnostic to the specific simulation codes used and can support diverse applications in computational science that transcend its initial materials domain.
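The idea of retracing a result through a provenance graph, as described above, can be illustrated with a plain-Python sketch. It does not use the AiiDA API. The node names and the `PARENTS` mapping are hypothetical and only mimic the alternation of data and calculation nodes.

```python
# Each node maps to the nodes it was directly produced from.
# All names are made up for illustration.
PARENTS = {
    "structure": [],
    "pseudopotential": [],
    "relax_calc": ["structure", "pseudopotential"],
    "relaxed_structure": ["relax_calc"],
    "bands_calc": ["relaxed_structure", "pseudopotential"],
    "band_structure": ["bands_calc"],
}

def retrace(node, parents=PARENTS):
    """Return every ancestor of a node: all inputs and calculations
    that contributed to it, directly or indirectly."""
    seen = []
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen

# Everything needed to reproduce the final band structure.
history = retrace("band_structure")
```

Because every result carries its full ancestry, a reader can rerun the same chain of calculations or branch off from any intermediate node.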