Rapid, non-destructive characterization of the molecular-level chemistry of organic matter (OM) is experimentally challenging. Raman spectroscopy is one of the most widely used techniques for non-destructive chemical characterization, but it does not currently provide detailed identification of the molecular components of OM, owing to the combination of diffraction-limited spatial resolution and the poor applicability of peak-fitting algorithms. Here, we develop a genome-inspired collective molecular structure fingerprinting approach, which uses ab initio calculations and data mining techniques to extract molecular-level chemistry from the Raman spectra of OM. We illustrate the power of this approach by identifying representative molecular fingerprints in OM, whose molecular chemistry has to date been inaccessible to non-destructive characterization techniques. Chemical properties such as the aromatic cluster size distribution and the H/C ratio can now be quantified directly from the identified molecular fingerprints. Our approach will enable non-destructive identification of chemical signatures and their correlation with the preservation of biosignatures in OM, accurate detection and quantification of environmental contamination, and objective assessment of OM with respect to its chemical content.
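As an illustration of how bulk properties might follow from identified fingerprints, the sketch below computes an H/C ratio and an aromatic cluster size distribution from a weighted set of molecules. It is a minimal sketch under stated assumptions, not the method of this work: the three-molecule library, the relative abundances, and the function names `hc_ratio` and `cluster_size_distribution` are all hypothetical, and the weights are assumed to be relative molar abundances that some prior fingerprinting step has already produced.

```python
from collections import defaultdict

# Hypothetical fingerprint library (assumption, for illustration only):
# name -> (carbon atoms, hydrogen atoms, fused aromatic rings).
FINGERPRINTS = {
    "benzene": (6, 6, 1),       # C6H6
    "naphthalene": (10, 8, 2),  # C10H8
    "pyrene": (16, 10, 4),      # C16H10
}


def hc_ratio(weights):
    """Atomic H/C ratio of a mixture, given relative molar abundances."""
    total_c = sum(w * FINGERPRINTS[name][0] for name, w in weights.items())
    total_h = sum(w * FINGERPRINTS[name][1] for name, w in weights.items())
    return total_h / total_c


def cluster_size_distribution(weights):
    """Fraction of molecules per aromatic cluster size (number of fused rings)."""
    total = sum(weights.values())
    distribution = defaultdict(float)
    for name, w in weights.items():
        rings = FINGERPRINTS[name][2]
        distribution[rings] += w / total
    return dict(distribution)


# Hypothetical abundances; in practice these would come from the fingerprinting step.
weights = {"benzene": 0.5, "naphthalene": 0.3, "pyrene": 0.2}
print(hc_ratio(weights))
print(cluster_size_distribution(weights))
```

The H/C ratio here is an abundance-weighted atom count rather than an average of per-molecule ratios, so larger molecules contribute in proportion to the atoms they carry.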