Deep learning for peptide identification from metaproteomics datasets

75 0 0.0 ( 0 )

Download Cite

Added by Xuan Guo

Publication date 2020

fields Biology

and research's language is English

Authors Xuan Guo - Shichao Feng

Quantitative Methods

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Metaproteomics are becoming widely used in microbiome research for gaining insights into the functional state of the microbial community. Current metaproteomics studies are generally based on high-throughput tandem mass spectrometry (MS/MS) coupled with liquid chromatography. The identification of peptides and proteins from MS data involves the computational procedure of searching MS/MS spectra against a predefined protein sequence database and assigning top-scored peptides to spectra. Existing computational tools are still far from being able to extract all the information out of large MS/MS datasets acquired from metaproteome samples. In this paper, we proposed a deep-learning-based algorithm, called DeepFilter, for improving the rate of confident peptide identifications from a collection of tandem mass spectra. Compared with other post-processing tools, including Percolator, Q-ranker, PeptideProphet, and Iprophet, DeepFilter identified 20% and 10% more peptide-spectrum-matches and proteins, respectively, on marine microbial and soil microbial metaproteome samples with false discovery rate at 1%.

rate research

RAId DbS: A Mass-Spectrometry Based Peptide Identification Web Server with Knowledge Integration

355 - Gelio Alves , Aleksey Ogurtsov , 2008

Summary: In anticipation of the individualized proteomics era and the need to integrate knowledge from disease studies, we have augmented our peptide identification software RAId DbS to take into account annotated single amino acid polymorphisms, post-translational modifications, and their documented disease associations while analyzing a tandem mass spectrum. To facilitate new discoveries, RAId DbS allows users to conduct searches permitting novel polymorphisms. Availability: The webserver link is http://www.ncbi.nlm.nih.gov/ /CBBResearch/qmbp/raid dbs/index.html. The relevant databases and binaries of RAId DbS for Linux, Windows, and Mac OS X are available from the same web page. Contact: [email protected]

Quantitative Methods

A Pilot Study For Fragment Identification Using 2D NMR and Deep Learning

72 - Stefan Kuhn , Eda Tumer , Simon Colreavy-Donnelly 2021

This paper presents a method to identify substructures in NMR spectra of mixtures, specifically 2D spectra, using a bespoke image-based Convolutional Neural Network application. This is done using HSQC and HMBC spectra separately and in combination. The application can reliably detect substructures in pure compounds, using a simple network. It can work for mixtures when trained on pure compounds only. HMBC data and the combination of HMBC and HSQC show better results than HSQC alone.

Quantitative Methods Artificial Intelligence Machine Learning

Fast deep learning correspondence for neuron tracking and identification in C.elegans using synthetic training

264 - Xinwei Yu , Matthew S. Creamer , Francesco Randi 2021

We present an automated method to track and identify neurons in C. elegans, called fast Deep Learning Correspondence or fDLC, based on the transformer network architecture. The model is trained once on empirically derived synthetic data and then predicts neural correspondence across held-out real animals via transfer learning. The same pre-trained model both tracks neurons across time and identifies corresponding neurons across individuals. Performance is evaluated against hand-annotated datasets, including NeuroPAL [1]. Using only position information, the method achieves 80.0% accuracy at tracking neurons within an individual and 65.8% accuracy at identifying neurons across individuals. Accuracy is even higher on a published dataset [2]. Accuracy reaches 76.5% when using color information from NeuroPAL. Unlike previous methods, fDLC does not require straightening or transforming the animal into a canonical coordinate system. The method is fast and predicts correspondence in 10 ms making it suitable for future real-time applications.

Quantitative Methods Computer Vision and Pattern Recognition Neurons and Cognition

Learning Deep Models from Synthetic Data for Extracting Dolphin Whistle Contours

119 - Pu Li , Xiaobai Liua , K. J. Palmer 2020

We present a learning-based method for extracting whistles of toothed whales (Odontoceti) in hydrophone recordings. Our method represents audio signals as time-frequency spectrograms and decomposes each spectrogram into a set of time-frequency patches. A deep neural network learns archetypical patterns (e.g., crossings, frequency modulated sweeps) from the spectrogram patches and predicts time-frequency peaks that are associated with whistles. We also developed a comprehensive method to synthesize training samples from background environments and train the network with minimal human annotation effort. We applied the proposed learn-from-synthesis method to a subset of the public Detection, Classification, Localization, and Density Estimation (DCLDE) 2011 workshop data to extract whistle confidence maps, which we then processed with an existing contour extractor to produce whistle annotations. The F1-score of our best synthesis method was 0.158 greater than our baseline whistle extraction algorithm (~25% improvement) when applied to common dolphin (Delphinus spp.) and bottlenose dolphin (Tursiops truncatus) whistles.

Quantitative Methods Sound Audio and Speech Processing

Deep Learning for Identifying Metastatic Breast Cancer

85 - Dayong Wang , Aditya Khosla , Rishab Gargeya 2016

The International Symposium on Biomedical Imaging (ISBI) held a grand challenge to evaluate computational systems for the automated detection of metastatic breast cancer in whole slide images of sentinel lymph node biopsies. Our team won both competitions in the grand challenge, obtaining an area under the receiver operating curve (AUC) of 0.925 for the task of whole slide image classification and a score of 0.7051 for the tumor localization task. A pathologist independently reviewed the same images, obtaining a whole slide image classification AUC of 0.966 and a tumor localization score of 0.733. Combining our deep learning systems predictions with the human pathologists diagnoses increased the pathologists AUC to 0.995, representing an approximately 85 percent reduction in human error rate. These results demonstrate the power of using deep learning to produce significant improvements in the accuracy of pathological diagnoses.

Quantitative Methods Computer Vision and Pattern Recognition