Motivation: Assigning statistical significance accurately has become increasingly important as metadata of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy in metadata at any level may propagate to downstream analyses, undermining the validity of the scientific conclusions drawn from them. In mass spectrometry-based proteomics, accurate statistics for peptide identification can now be achieved, but accurate protein-level statistics remain challenging. Results: We have constructed a protein identification (ID) method that combines the peptide evidence for a candidate protein using a rigorous formula derived earlier; in this formula, the database $P$-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We also show that this protein ID method provides accurate protein-level $E$-values, eliminating the need for empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Soric formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with those of the other methods tested. Availability: The source code, implemented in C++ on a Linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit
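To illustrate how accurate protein-level $E$-values can be turned into a proportion of false discoveries (PFD), here is a minimal sketch. It assumes the Soric formula is applied in the form PFD ≈ $E_c$ / $D(E_c)$, where $E_c$ is the $E$-value cutoff and $D(E_c)$ is the number of proteins reported with $E \le E_c$; the abstract does not spell out this form. The function name `soric_pfd` and its parameters are hypothetical and are not taken from the RAId source code.

```python
from typing import Sequence


def soric_pfd(e_values: Sequence[float], e_cutoff: float) -> float:
    """Estimate the proportion of false discoveries at an E-value cutoff.

    Assumption: PFD ~ E_c / D(E_c), where D(E_c) counts the proteins
    whose E-value is at or below the cutoff E_c.
    """
    if e_cutoff <= 0:
        raise ValueError("e_cutoff must be positive")
    # Number of proteins reported as identified at this cutoff.
    discoveries = sum(1 for e in e_values if e <= e_cutoff)
    if discoveries == 0:
        # Nothing was reported, so there are no false discoveries.
        return 0.0
    # With accurate E-values, the cutoff itself is the expected number of
    # false hits; the ratio is capped at 1 because it is a proportion.
    return min(1.0, e_cutoff / discoveries)


# Hypothetical protein-level E-values, for illustration only.
example_e_values = [1e-6, 1e-4, 0.01, 0.5, 2.0, 10.0]
pfd_at_one = soric_pfd(example_e_values, 1.0)  # 4 proteins pass the cutoff
```

The estimate is only as good as the $E$-values it is given: if they are miscalibrated, the cutoff no longer equals the expected number of false hits and the ratio loses its meaning.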
Summary: In anticipation of the individualized proteomics era and the need to integrate knowledge from disease studies, we have augmented our peptide identification software RAId DbS to take into account annotated single amino acid polymorphisms, pos
The common techniques to study protein-protein proximity in vivo are not well-adapted to the capabilities and the expertise of a standard proteomics laboratory, typically based on the use of mass spectrometry. With the aim of closing this gap, we hav
Native electrospray ionization/ion mobility-mass spectrometry (ESI/IM-MS) allows an accurate determination of low-resolution structural features of proteins. Yet, the presence of proton dynamics, observed already by us for DNA in the gas phase, and i
Motivation: Accurate estimation of the false discovery rate (FDR) of spectral identification is a central problem in mass spectrometry-based proteomics. Over the past two decades, target-decoy approaches (TDAs) and decoy-free approaches (DFAs) have been
BACKGROUND: One of the most evident achievements of bioinformatics is the development of methods that transfer biological knowledge from characterised proteins to uncharacterised sequences. This mode of protein function assignment is mostly based on