Oral Presentation Royal Australian Chemical Institute National Congress 2026

Predictive mass spectrometry from quantum-mechanical fragmentation and intensity modelling (137071)

Victor Posligua 1, Jeremy Tan 2, Stella Sing Yee Lee 2, Yihui Chong 2, Bingquan Shen 2, Stephen G Dale 1
  1. National University of Singapore, Singapore, SINGAPORE
  2. DSO National Laboratories, Singapore

Machine learning methods have been widely explored for chemical property prediction, including electron-impact mass spectrum (EI-MS) prediction (e.g., NEIMS [1]), but their reliability is often limited by the availability, coverage and consistency of experimental training data [2,3]. Mass spectrometry is a comparatively data-rich exception, with curated EI reference libraries containing on the order of 10⁵–10⁶ spectra; for example, the NIST 23 EI library reports ~394K EI spectra (with replicate spectra for some compounds) [4]. However, even these resources fall short of the enormous data requirements for training AI methods and leave gaps across chemical space, motivating complementary data sources that can generate controlled and traceable spectra. This talk will present an end-to-end quantum mechanics (QM) pipeline that predicts full m/z-intensity spectra directly from SMILES. The approach is energy-driven: fragmentation channels are ranked and filtered using QM energetics, and peak intensities are assigned using simple global rules to enable transparent benchmarking and failure-mode analysis. We evaluate the pipeline on alcohol reference series and a subset obtained from the Critical Assessment of Small Molecule Identification (CASMI) [5], and include QCxMS2 [6] as a state-of-the-art literature comparison.
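As a rough illustration of the energy-driven idea described above, the following minimal Python sketch ranks candidate fragment channels by relative energy, discards those above an energy window, and assigns peak intensities with a single global rule. It is not the pipeline presented in the talk: the function name `predict_spectrum`, the parameters `energy_window_ev` and `temperature_ev`, the Boltzmann-like weighting and the example energies are all assumptions made for illustration only.

```python
import math
from collections import defaultdict


def predict_spectrum(channels, energy_window_ev=3.0, temperature_ev=0.5):
    """Turn candidate fragment channels into a stick spectrum.

    channels: list of (mz, relative_energy_ev) pairs, one per candidate
              fragment ion (hypothetical input format).
    Returns a dict {mz: intensity} scaled so the base peak is 100.
    """
    if not channels:
        return {}

    # Reference energy: the most favourable channel.
    e_min = min(energy for _, energy in channels)

    # Filter: keep only channels within the energy window (assumed rule).
    kept = [(mz, e) for mz, e in channels if e - e_min <= energy_window_ev]

    # Rank: lowest-energy channels first.
    kept.sort(key=lambda channel: channel[1])

    # Global intensity rule (assumed): Boltzmann-like weight per channel,
    # summed over channels that share the same m/z.
    weights = defaultdict(float)
    for mz, e in kept:
        weights[mz] += math.exp(-(e - e_min) / temperature_ev)

    # Normalise to a base peak of 100.
    base = max(weights.values())
    return {mz: 100.0 * w / base for mz, w in weights.items()}


# Placeholder channels with made-up energies (not computed QM values).
example_channels = [(31, 0.0), (45, 0.5), (29, 5.0)]
spectrum = predict_spectrum(example_channels)
```

Because the intensity rule has only two global parameters in this sketch, any disagreement with a reference spectrum can be traced back either to the channel energies or to the rule itself, which is the kind of transparency the abstract aims for.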

[1] Wei, J.N. et al., Rapid Prediction of Electron–Ionization Mass Spectra Using Neural Networks. ACS Cent. Sci., 5, 700-708, 2019.

[2] Zhu, R.L. and Jonas, E. Rapid Approximate Subset-Based Spectra Prediction for Electron Ionization–Mass Spectrometry. Anal. Chem., 95, 2653-2663, 2023.

[3] Malarvannan, M., et al. Assessment of computational approaches in the prediction of spectrogram and chromatogram behaviours of analytes in pharmaceutical analysis: assessment review. Futur. J. Pharm. Sci., 9, 86, 2023.

[4] Scientific Instrument Services (SIS). NIST 23 Mass Spectral Library, NIST 2023/2020/2017 Database, Agilent Format Available. https://www.sisweb.com/software/ms/nist.htm (Accessed 2026-01-15).

[5] Vaniya, A. and Fiehn, O. Revisiting CASMI: Compound ID for 500 New Unknowns, Using LC/MS/MS Data, 2022.

[6] Gorges, J. and Grimme, S. QCxMS2 – a program for the calculation of electron ionization mass spectra via automated reaction network discovery. Phys. Chem. Chem. Phys., 27, 6899-6911, 2025.