Scalable multicomponent spectral analysis for high-throughput data annotation


English abstract

Orchestrating the parametric fitting of multicomponent spectra at scale is an essential yet underappreciated task in the high-throughput quantification of materials and chemical composition. To automate the annotation of spectroscopic and diffraction data collected by the hundreds to thousands, we present a systematic approach that is compatible with high-performance computing infrastructures and builds on the MapReduce model and task-based parallelization. We implement the approach in software and demonstrate linear computational scaling with respect to the number of spectral components, using multidimensional experimental materials characterization datasets from photoemission spectroscopy and powder electron diffraction as benchmarks. Our approach enables efficient generation of high-quality data annotations and online spectral analysis, and it is applicable to a variety of analytical techniques in materials science and chemistry as a building block for closed-loop experimental systems.
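As a rough illustration of the MapReduce pattern applied to spectral fitting, the sketch below treats each spectrum as an independent task (map) and merges the fitted parameters into a single annotation table (reduce). It is not the authors' software. Every name in it (`COMPONENTS`, `fit_amplitudes`, `merge`, `annotate`) is hypothetical, and the model is deliberately simplified: two Gaussian components with fixed, assumed centers and widths, so that only the amplitudes are fitted and the problem reduces to linear least squares. Realistic multicomponent fitting is generally nonlinear and would need an iterative optimizer.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Assumed for illustration: fixed (center, width) of two Gaussian components.
COMPONENTS = [(-1.0, 0.5), (1.0, 0.5)]


def gaussian(x, center, width):
    """Unit-height Gaussian line shape."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def fit_amplitudes(task):
    """Map step: least-squares amplitudes for one spectrum.

    `task` is a (key, xs, ys) tuple. Only the two amplitudes are free,
    so the normal equations form a 2x2 linear system.
    """
    key, xs, ys = task
    g1 = [gaussian(x, *COMPONENTS[0]) for x in xs]
    g2 = [gaussian(x, *COMPONENTS[1]) for x in xs]
    # Entries of G^T G and G^T y.
    a11 = sum(u * u for u in g1)
    a12 = sum(u * v for u, v in zip(g1, g2))
    a22 = sum(v * v for v in g2)
    b1 = sum(u * y for u, y in zip(g1, ys))
    b2 = sum(v * y for v, y in zip(g2, ys))
    # Solve the 2x2 system with Cramer's rule.
    det = a11 * a22 - a12 * a12
    amp1 = (b1 * a22 - a12 * b2) / det
    amp2 = (a11 * b2 - a12 * b1) / det
    return key, (amp1, amp2)


def merge(annotations, result):
    """Reduce step: collect per-spectrum results into one dictionary."""
    key, amplitudes = result
    annotations[key] = amplitudes
    return annotations


def annotate(tasks, workers=4):
    """Fit all spectra as independent tasks, then merge the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return reduce(merge, pool.map(fit_amplitudes, tasks), {})
```

A thread pool is used here only to keep the sketch self-contained. Because the fitting code is pure Python and CPU-bound, a process pool or a cluster task scheduler would be the natural substitute for real workloads, and the map and reduce functions would not need to change.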
