MESSAR: Automated recommendation of metabolite substructures from tandem mass spectra
Authors:
Youzhong Liu aff001; Aida Mrzic aff001; Pieter Meysman aff001; Thomas De Vijlder aff003; Edwin P. Romijn aff003; Dirk Valkenborg aff004; Wout Bittremieux aff001; Kris Laukens aff001
Author affiliations:
aff001: Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
aff002: Biomedical Informatics Network Antwerpen (biomina), University of Antwerp, Antwerp, Belgium
aff003: Pharmaceutical Development & Manufacturing Sciences (PDMS), Janssen Research & Development, Beerse, Belgium
aff004: Interuniversity Institute for Biostatistics and Statistical Bioinformatics, Hasselt University, Diepenbeek, Belgium
aff005: Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, San Diego, CA, United States of America
Published in:
PLoS ONE 15(1)
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pone.0226770
Summary
Despite the increasing importance of non-targeted metabolomics for answering life science questions, extracting biochemically relevant information from metabolomics spectral data remains an incompletely solved problem. Most computational tools for identifying tandem mass spectra focus on a limited set of molecules of interest. Such tools are typically constrained by the availability of reference spectra or molecular databases, which limits their applicability for generating structural hypotheses for unknown metabolites. In contrast, recent advances in the field show that the underlying biochemistry can be exposed without relying on metabolite identification, in particular via substructure prediction. We describe an automated method for substructure recommendation motivated by association rule mining. Our framework captures potential relationships between spectral features and substructures, learned from public spectral libraries. These associations are then used to recommend substructures for any unknown mass spectrum. Because the method does not require any predefined metabolite candidates, it can be used for hypothesis generation or partial identification of unknown unknowns. The method is called MESSAR (MEtabolite SubStructure Auto-Recommender) and is implemented as a free online web service available at messar.biodatamining.be.
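To make the idea of association rules between spectral features and substructures more concrete, the following is a minimal illustrative sketch and not the MESSAR implementation. It assumes a toy library in which each spectrum is reduced to a set of feature labels and each molecule to a set of substructure labels. All labels, thresholds, and function names (`mine_rules`, `recommend`) are hypothetical, and the scoring scheme (summed rule confidence) is a simplification chosen for brevity.

```python
from collections import Counter

# Toy training library (invented for illustration): each entry pairs the
# spectral features of one spectrum with the substructures of its known molecule.
LIBRARY = [
    ({"peak_91.05", "loss_18.01"}, {"benzyl", "hydroxyl"}),
    ({"peak_91.05", "peak_65.04"}, {"benzyl"}),
    ({"loss_18.01", "loss_44.00"}, {"hydroxyl", "carboxyl"}),
    ({"peak_91.05", "loss_44.00"}, {"benzyl", "carboxyl"}),
]


def mine_rules(library, min_support=0.25, min_confidence=0.6):
    """Mine single-feature rules 'feature -> substructure'.

    support    = fraction of library entries containing both items
    confidence = fraction of entries with the feature that also
                 contain the substructure
    """
    n = len(library)
    feature_counts = Counter()
    pair_counts = Counter()
    for features, substructures in library:
        feature_counts.update(features)
        pair_counts.update((f, s) for f in features for s in substructures)

    rules = {}
    for (f, s), count in pair_counts.items():
        support = count / n
        confidence = count / feature_counts[f]
        if support >= min_support and confidence >= min_confidence:
            rules[(f, s)] = (support, confidence)
    return rules


def recommend(rules, query_features):
    """Rank substructures by the summed confidence of rules matching the query."""
    scores = Counter()
    for (f, s), (_support, confidence) in rules.items():
        if f in query_features:
            scores[s] += confidence
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    rules = mine_rules(LIBRARY)
    for substructure, score in recommend(rules, {"peak_91.05", "loss_44.00"}):
        print(substructure, round(score, 2))
```

In this sketch no candidate molecule list is consulted at query time: the recommendation depends only on the rules learned from the library and on the features observed in the unknown spectrum, which mirrors the candidate-free property described above.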
Keywords:
Drug metabolism – Machine learning algorithms – Mass spectra – Metabolic networks – Metabolites – Metabolomics – Molecular structure – Statistical data
Article published in
PLOS One
2020, Issue 1