254 - Gelio Alves , Yi-Kuo Yu 2014
Motivation: Assigning statistical significance accurately has become increasingly important as meta data of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of meta data at any level may propagate to downstream analyses, undermining the validity of the scientific conclusions drawn from them. From the perspective of mass spectrometry based proteomics, even though accurate statistics for peptide identification can now be achieved, accurate protein level statistics remain challenging. Results: We have constructed a protein ID method that combines the peptide evidence for a candidate protein based on a rigorous formula derived earlier; in this formula the database $P$-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We have also shown that this protein ID method provides accurate protein level $E$-values, eliminating the need for empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Sorić formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with those of the other methods tested. Availability: The source code, implemented in C++ on a Linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit
91 - Gelio Alves , Yi-Kuo Yu 2010
Good's formula and Fisher's method are frequently used for combining independent P-values. Interestingly, the equivalent of Good's formula already emerged in 1910, and mathematical expressions relevant to even more general situations have been repeatedly derived, albeit in different contexts. We provide here a novel derivation and show how the analytic formula obtained reduces to the two aforementioned ones as special cases. The main novelty of this paper, however, is the explicit treatment of nearly degenerate weights, which are known to cause numerical instabilities. We derive a controlled expansion, in powers of differences in inverse weights, that provides both accurate statistics and stable numerics.
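To make the two special cases named in the abstract above concrete, here is a minimal Python sketch of the textbook closed forms: Fisher's method for the unweighted product of P-values, and Good's formula for a weighted product with pairwise distinct weights. The function names are hypothetical, and this is not the paper's general formula or its controlled expansion. Note that Good's formula divides by differences of weights, which is why nearly equal weights cause the numerical trouble the abstract mentions.

```python
import math

def fisher_combined_p(pvalues):
    """Fisher's method in closed form: P = tau * sum_{k<n} (-ln tau)^k / k!,
    where tau is the product of the n independent P-values."""
    n = len(pvalues)
    log_tau = sum(math.log(p) for p in pvalues)
    t = -log_tau
    term, total = 1.0, 0.0
    for k in range(n):
        total += term          # adds t^k / k!
        term *= t / (k + 1)
    return math.exp(log_tau) * total

def good_combined_p(pvalues, weights):
    """Good's formula for Q = prod p_i^w_i, valid only when all weights are
    positive and pairwise distinct (the denominators vanish otherwise)."""
    n = len(weights)
    log_q = sum(w * math.log(p) for p, w in zip(pvalues, weights))
    total = 0.0
    for i, wi in enumerate(weights):
        denom = 1.0
        for j, wj in enumerate(weights):
            if j != i:
                denom *= (wi - wj)
        # coefficient w_i^(n-1) / prod_{j != i} (w_i - w_j), times Q^(1/w_i)
        total += wi ** (n - 1) / denom * math.exp(log_q / wi)
    return total
```

With weights that are nearly equal, for example `good_combined_p([0.05, 0.1], [1.0, 1.0001])`, the result stays close to `fisher_combined_p([0.05, 0.1])`, but it is obtained by cancelling large terms of opposite sign; pushing the weights closer together makes that cancellation progressively less reliable.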
Statistically meaningful comparison/combination of peptide identification results from various search methods is impeded by the lack of a universal statistical standard. Providing an E-value calibration protocol, we demonstrated earlier the feasibility of translating either the score or heuristic E-value reported by any method into the textbook-defined E-value, which may serve as the universal statistical standard. This protocol, although robust, may lose spectrum-specific statistics and might require a new calibration when changes in experimental setup occur. To mitigate these issues, we developed a new MS/MS search tool, RAId_aPS, that is able to provide spectrum-specific E-values for additive scoring functions. Given a selection of scoring functions out of RAId score, K-score, Hyperscore and XCorr, RAId_aPS generates the corresponding score histograms of all possible peptides using dynamic programming. Using these score histograms to assign E-values enables a calibration-free protocol for accurate significance assignment for each scoring function. RAId_aPS features four different modes: (i) compute the total number of possible peptides for a given molecular mass range, (ii) generate the score histogram given an MS/MS spectrum and a scoring function, (iii) reassign E-values for a list of candidate peptides given an MS/MS spectrum and the scoring functions chosen, and (iv) perform database searches using selected scoring functions. In modes (iii) and (iv), RAId_aPS is also capable of combining results from different scoring functions using spectrum-specific statistics. The web link is http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid_aps/index.html. Relevant binaries for Linux, Windows, and Mac OS X are available from the same page.
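The dynamic-programming idea behind the score histogram can be illustrated with a toy Python sketch. It is not the RAId_aPS implementation: it assumes integer residue masses, and an additive score in which a sequence gains `peak_score[m]` whenever one of its prefix masses equals `m`. The names `score_histogram`, `tail_fraction`, `residue_masses` and `peak_score` are hypothetical.

```python
from collections import Counter

def score_histogram(target_mass, residue_masses, peak_score):
    """Histogram of additive scores over all residue sequences whose integer
    masses sum to target_mass (toy model; see assumptions in the text)."""
    # hist[m] maps score -> number of sequences with total mass m
    hist = [Counter() for _ in range(target_mass + 1)]
    hist[0][0] = 1  # the empty sequence has mass 0 and score 0
    for m in range(1, target_mass + 1):
        gain = peak_score.get(m, 0)  # score earned by a prefix of mass m
        for r in residue_masses:
            if r <= m:
                # extend every sequence of mass m - r by one residue of mass r
                for s, count in hist[m - r].items():
                    hist[m][s + gain] += count
    return hist[target_mass]

def tail_fraction(hist, threshold):
    """Fraction of all enumerated sequences scoring at least threshold."""
    total = sum(hist.values())
    return sum(c for s, c in hist.items() if s >= threshold) / total

# Toy alphabet with residue masses 2 and 3, and peaks at masses 2 and 5.
hist = score_histogram(7, (2, 3), {2: 1, 5: 2})
print(sorted(hist.items()))    # [(1, 1), (2, 1), (3, 1)]
print(tail_fraction(hist, 2))  # 2 of the 3 sequences score at least 2
```

Summing the histogram gives the number of sequences of the target mass, which corresponds loosely to mode (i); the tail fraction is the kind of quantity from which a spectrum-specific significance value could be derived. The cost grows with the mass range times the alphabet size rather than with the number of sequences, which is the point of using dynamic programming here.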
93 - Yi-Kuo Yu 2009
A rigorous derivation of the density functional in the Hohenberg-Kohn theory is presented. With no assumption regarding the magnitude of the electric coupling constant $e^2$ (or correlation), this work provides a firm basis for first-principles calculations. Using the auxiliary field method, in which $e^2$ need not be small, we show that the bosonic loop expansion of the exchange-correlation functional can be reorganized so as to be expressed entirely in terms of the Kohn-Sham single-particle orbitals and energies. The excitations of the many-particle system can be obtained within the same formalism. We also explicitly demonstrate at zero temperature the single-particle limit, the weak-coupling limit of the energy functional, and its application to the homogeneous electron gas.
67 - Yi-Kuo Yu 2009
A rigorous derivation of the density functional via the effective action in the Hohenberg-Kohn theory is outlined. Using the auxiliary field method, in which the electric coupling constant $e^2$ need not be small, we show that the loop expansion of the exchange-correlation functional can be reorganized so as to be expressed entirely in terms of the Kohn-Sham single-particle orbitals and energies.
119 - Gelio Alves , Yi-Kuo Yu 2008
We provide a complete thermodynamic solution of a 1D hopping model in the presence of a random potential by obtaining the density of states. Since the partition function is related to the density of states by a Laplace transform, the density of states completely determines the thermodynamic behavior of the system. We have also shown that the transfer matrix technique, or the so-called dynamic programming, used to obtain the density of states in the 1D hopping model may be generalized to tackle a long-standing problem in statistical significance assessment for one of the most important proteomic tasks: peptide sequencing using tandem mass spectrometry data.
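The Laplace-transform relation mentioned in the abstract above can be written out explicitly. The notation is assumed here for illustration and is not taken from the paper: $\rho(E)$ is the density of states, $\beta$ the inverse temperature, $Z$ the partition function and $F$ the free energy.

```latex
Z(\beta) = \int \rho(E)\, e^{-\beta E}\, dE ,
\qquad
F(\beta) = -\frac{1}{\beta} \ln Z(\beta)
```

Once $\rho(E)$ is known, $Z(\beta)$ follows for every $\beta$, and the other thermodynamic quantities follow from $F(\beta)$ by differentiation.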
109 - Gelio Alves , Aleksey Ogurtsov , 2008
Summary: In anticipation of the individualized proteomics era and the need to integrate knowledge from disease studies, we have augmented our peptide identification software RAId DbS to take into account annotated single amino acid polymorphisms, post-translational modifications, and their documented disease associations while analyzing a tandem mass spectrum. To facilitate new discoveries, RAId DbS allows users to conduct searches permitting novel polymorphisms. Availability: The webserver link is http://www.ncbi.nlm.nih.gov/ /CBBResearch/qmbp/raid dbs/index.html. The relevant databases and binaries of RAId DbS for Linux, Windows, and Mac OS X are available from the same web page. Contact: [email protected]
Using a heat conduction mechanism on a social network, we develop a systematic method to predict missing values as recommendations. This method can treat the very large matrices that are typical of internet communities. In particular, with an innovative, exact formulation that accommodates arbitrary boundary conditions, our method is easy to use in real applications. The performance is assessed by comparing with traditional recommendation methods using real data.