
Singular Value Decomposition and Principal Component Analysis

Added by: Michael E. Wall
Publication date: 2002
Field: Physics
Language: English





This chapter describes gene expression analysis by Singular Value Decomposition (SVD), emphasizing initial characterization of the data. We describe SVD methods for visualization of gene expression data, representation of the data using a smaller number of variables, and detection of patterns in noisy gene expression data. In addition, we describe the precise relation between SVD analysis and Principal Component Analysis (PCA) when PCA is calculated using the covariance matrix, enabling our descriptions to apply equally well to either method. Our aim is to provide definitions, interpretations, examples, and references that will serve as resources for understanding and extending the application of SVD and PCA to gene expression analysis.
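As a concrete illustration of the SVD/PCA relation mentioned in the abstract, the short sketch below (Python/NumPy; the matrix and variable names are hypothetical, not from the chapter) centers the columns of a small expression matrix and checks that the right singular vectors coincide, up to sign, with the eigenvectors of the covariance matrix used by PCA, and that the PCA variances equal the squared singular values divided by n - 1.

    import numpy as np

    # Hypothetical gene-expression matrix: rows = assays, columns = genes.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))

    # Center each column (gene) so PCA on the covariance matrix and SVD agree.
    Xc = X - X.mean(axis=0)

    # SVD of the centered data: Xc = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    # PCA via the covariance matrix: C = Xc^T Xc / (n - 1).
    n = Xc.shape[0]
    C = Xc.T @ Xc / (n - 1)
    evals, evecs = np.linalg.eigh(C)            # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]  # sort descending

    # PCA variances equal squared singular values / (n - 1),
    # and the right singular vectors equal the eigenvectors up to sign.
    print(np.allclose(evals, s**2 / (n - 1)))
    print(np.allclose(np.abs(Vt), np.abs(evecs.T)))

The projections U times s then serve as the reduced set of variables referred to in the abstract, and truncating to the leading singular triplets gives the lower-dimensional representation of the data.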



Related research


174 - Huamin Li, Yuval Kluger, 2016
Randomized algorithms provide solutions to two ubiquitous problems: (1) the distributed calculation of a principal component analysis or singular value decomposition of a highly rectangular matrix, and (2) the distributed calculation of a low-rank approximation (in the form of a singular value decomposition) to an arbitrary matrix. Carefully honed algorithms yield results that are uniformly superior to those of the stock, deterministic implementations in Spark (the popular platform for distributed computation); in particular, whereas the stock software will without warning return left singular vectors that are far from numerically orthonormal, a significantly burnished randomized implementation generates left singular vectors that are numerically orthonormal to nearly the machine precision.
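The abstract does not reproduce the algorithm itself; the following single-machine sketch of a basic randomized low-rank SVD (a Gaussian range-finder with a few power iterations, written in NumPy) is only an assumption-laden stand-in for the distributed Spark implementation it discusses.

    import numpy as np

    def randomized_svd(A, rank, n_oversample=10, n_power_iter=2, seed=0):
        """Approximate rank-`rank` SVD of A via random projection.

        A basic sketch of the randomized range-finder idea; the distributed
        Spark implementation discussed in the abstract is more involved.
        """
        rng = np.random.default_rng(seed)
        m, n = A.shape
        k = rank + n_oversample

        # Sample the range of A with a Gaussian test matrix.
        Omega = rng.normal(size=(n, k))
        Y = A @ Omega

        # Power iterations sharpen the approximation when the spectrum decays slowly.
        for _ in range(n_power_iter):
            Y = A @ (A.T @ Y)

        # Orthonormal basis for the sampled range.
        Q, _ = np.linalg.qr(Y)

        # SVD of the small projected matrix, then map back to the original space.
        B = Q.T @ A
        Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
        U = Q @ Ub
        return U[:, :rank], s[:rank], Vt[:rank, :]

    # Example on a tall, highly rectangular matrix.
    A = np.random.default_rng(1).normal(size=(2000, 50))
    U, s, Vt = randomized_svd(A, rank=10)
    print(np.allclose(U.T @ U, np.eye(10), atol=1e-10))  # near-orthonormal columns

The orthonormality check at the end mirrors the abstract's concern that left singular vectors returned by a solver should be numerically orthonormal.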
Unsupervised learning makes manifest the underlying structure of data without curated training and specific problem definitions. However, the inference of relationships between data points is frustrated by the 'curse of dimensionality' in high dimensions. Inspired by replica theory from statistical mechanics, we consider replicas of the system to tune the dimensionality and take the limit as the number of replicas goes to zero. The result is the intensive embedding, which is not only isometric (preserving local distances) but allows global structure to be more transparently visualized. We develop the Intensive Principal Component Analysis (InPCA) and demonstrate clear improvements in visualizations of the Ising model of magnetic spins, a neural network, and the dark energy cold dark matter (ΛCDM) model as applied to the Cosmic Microwave Background.
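The construction of the intensive embedding is not given in the abstract. As a loose illustration only, the sketch below assumes one common route: pairwise Bhattacharyya divergences between probabilistic models, followed by classical-MDS-style double centering and an eigen-embedding that keeps both positive and negative eigenvalues. The model family, divergence choice, and data here are hypothetical, and the paper's actual InPCA procedure may differ.

    import numpy as np

    # Hypothetical family of models: unit-variance Gaussians with different means.
    mus = np.linspace(-3.0, 3.0, 25)

    # Pairwise Bhattacharyya divergence for N(mu_i, 1) vs N(mu_j, 1).
    D = (mus[:, None] - mus[None, :]) ** 2 / 8.0

    # Classical-MDS-style double centering of the divergence matrix.
    n = len(mus)
    J = np.eye(n) - np.ones((n, n)) / n
    W = -0.5 * J @ D @ J

    # Eigen-embedding; keeping negative eigenvalues gives a Minkowski-like space.
    evals, evecs = np.linalg.eigh(W)
    order = np.argsort(-np.abs(evals))
    coords = evecs[:, order] * np.sqrt(np.abs(evals[order]))

    print(coords[:, :2].shape)  # first two embedding coordinates per model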
73 - P. Tandon, 2016
Performance of nuclear threat detection systems based on gamma-ray spectrometry often strongly depends on the ability to identify the part of the measured signal that can be attributed to background radiation. We have successfully applied a method based on Principal Component Analysis (PCA) to obtain a compact null-space model of background spectra, using PCA projection residuals to derive a source detection score. We have shown the method's utility in a threat detection system using mobile spectrometers in urban scenes (Tandon et al., 2012). While it is commonly assumed that measured photon counts follow a Poisson process, standard PCA makes a Gaussian assumption about the data distribution, which may be a poor approximation when photon counts are low. This paper studies whether and in what conditions PCA with a Poisson-based loss function (Poisson PCA) can outperform standard Gaussian PCA in modeling background radiation to enable more sensitive and specific nuclear threat detection.
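As a toy illustration of the standard Gaussian-PCA background model described above (a projection-residual detection score on synthetic spectra), the sketch below is not the pipeline of Tandon et al.; the spectra, retained-component count, and injected peak are all made up.

    import numpy as np

    rng = np.random.default_rng(0)
    n_channels, n_train = 128, 500

    # Hypothetical background spectra: a few smooth components plus Poisson noise.
    basis = np.stack([np.exp(-np.arange(n_channels) / s) for s in (20, 60, 150)])
    weights = rng.uniform(50, 150, size=(n_train, 3))
    background = rng.poisson(weights @ basis).astype(float)

    # Fit the PCA background model (mean plus leading components).
    mean = background.mean(axis=0)
    U, s, Vt = np.linalg.svd(background - mean, full_matrices=False)
    components = Vt[:3]                      # retained background subspace

    def detection_score(spectrum):
        """Norm of the projection residual outside the background subspace."""
        centered = spectrum - mean
        residual = centered - components.T @ (components @ centered)
        return np.linalg.norm(residual)

    # A background-like spectrum scores low; one with an extra line scores higher.
    test_bkg = rng.poisson(np.array([100, 80, 120]) @ basis).astype(float)
    test_src = test_bkg.copy()
    test_src[40:44] += 200.0                 # hypothetical source peak
    print(detection_score(test_bkg), detection_score(test_src))

A Poisson PCA variant would replace the implicit squared-error (Gaussian) loss in this fit with a Poisson likelihood, which is the comparison the paper studies.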
A measure called Physical Complexity is established and calculated for a population of sequences, based on statistical physics, automata theory, and information theory. It is a measure of the quantity of information in an organism's genome. It is based on Shannon's entropy, measuring the information in a population evolved in its environment, by using entropy to estimate the randomness in the genome. It is calculated from the difference between the maximal entropy of the population and the actual entropy of the population when in its environment, estimated by counting the number of fixed loci in the sequences of a population. Up to now, Physical Complexity has only been formulated for populations of sequences with the same length. Here, we investigate an extension to support variable-length populations. We then build upon this to construct a measure for the efficiency of information storage, which we later use in understanding clustering within populations. Finally, we investigate our extended Physical Complexity through simulations, showing it to be consistent with the original.
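A minimal fixed-length illustration of the entropy bookkeeping described above (per-site population entropy subtracted from the maximal entropy) is sketched below; the variable-length extension proposed in the paper is not attempted here, and the toy population is hypothetical.

    import numpy as np
    from collections import Counter

    def physical_complexity(population, alphabet="ACGT"):
        """Maximal entropy minus per-site population entropy (fixed-length case).

        Entropy is measured per site in units of log(alphabet size), so a fully
        fixed locus adds one unit to the complexity and a uniformly random
        locus adds nothing.
        """
        length = len(population[0])
        max_entropy = length                      # one unit per site
        site_entropy = 0.0
        for i in range(length):
            counts = Counter(seq[i] for seq in population)
            p = np.array(list(counts.values()), dtype=float)
            p /= p.sum()
            site_entropy += -(p * np.log(p)).sum() / np.log(len(alphabet))
        return max_entropy - site_entropy

    # Hypothetical population: first three loci fixed, last two random.
    rng = np.random.default_rng(0)
    pop = ["ACG" + "".join(rng.choice(list("ACGT"), size=2)) for _ in range(200)]
    print(physical_complexity(pop))   # close to 3 for this toy population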
We describe how a single-particle tracking experiment should be designed in order for its recorded trajectories to contain the most information about a tracked particle's diffusion coefficient. The precision of estimators for the diffusion coefficient is affected by motion blur, limited photon statistics, and the length of recorded time-series. We demonstrate for a particle undergoing free diffusion that precision is negligibly affected by motion blur in typical experiments, while optimizing photon counts and the number of recorded frames is the key to precision. Building on these results, we describe for a wide range of experimental scenarios how to choose experimental parameters in order to optimize the precision. Generally, one should choose quantity over quality: experiments should be designed to maximize the number of frames recorded in a time-series, even if this means lower information content in individual frames.
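The "quantity over quality" point can be illustrated with a toy free-diffusion simulation and a simple lag-1 mean-squared-displacement estimator. Unlike the estimators analyzed in the paper, this sketch ignores motion blur and treats localization noise as negligible, and all parameter values are hypothetical.

    import numpy as np

    def simulate_track(D, dt, n_frames, loc_noise, rng):
        """1-D free diffusion sampled at interval dt, plus localization noise."""
        steps = rng.normal(scale=np.sqrt(2 * D * dt), size=n_frames - 1)
        x = np.concatenate([[0.0], np.cumsum(steps)])
        return x + rng.normal(scale=loc_noise, size=n_frames)

    def estimate_D(track, dt):
        """Simple lag-1 mean-squared-displacement estimator (no blur correction)."""
        disp = np.diff(track)
        return np.mean(disp**2) / (2 * dt)

    rng = np.random.default_rng(0)
    D_true, dt, loc_noise = 0.5, 0.01, 0.01

    # Longer time-series give tighter estimates: precision improves with frame count.
    for n_frames in (50, 200, 800):
        ests = [estimate_D(simulate_track(D_true, dt, n_frames, loc_noise, rng), dt)
                for _ in range(500)]
        print(n_frames, "frames: mean", round(np.mean(ests), 3),
              "std", round(np.std(ests), 4))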