ﻻ يوجد ملخص باللغة العربية
The development of single-cell technologies provides the opportunity to identify new cellular states and reconstruct novel cell-to-cell relationships. Applications range from understanding the transcriptional and epigenetic processes involved in metazoan development to characterizing distinct cells types in heterogeneous populations like cancers or immune cells. However, analysis of the data is impeded by its unknown intrinsic biological and technical variability together with its sparseness; these factors complicate the identification of true biological signals amidst artifact and noise. Here we show that, across technologies, roughly 95% of the eigenvalues derived from each single-cell data set can be described by universal distributions predicted by Random Matrix Theory. Interestingly, 5% of the spectrum shows deviations from these distributions and present a phenomenon known as eigenvector localization, where information tightly concentrates in groups of cells. Some of the localized eigenvectors reflect underlying biological signal, and some are simply a consequence of the sparsity of single cell data; roughly 3% is artifactual. Based on the universal distributions and a technique for detecting sparsity induced localization, we present a strategy to identify the residual 2% of directions that encode biological information and thereby denoise single-cell data. We demonstrate the effectiveness of this approach by comparing with standard single-cell data analysis techniques in a variety of examples with marked cell populations.
Single cell responses are shaped by the geometry of signaling kinetic trajectories carved in a multidimensional space spanned by signaling protein abundances. It is however challenging to assay large number (>3) of signaling species in live-cell imag
The existence of doublets is a key confounder in single-cell RNA sequencing (scRNA-seq) data analysis. Computational methods have been developed for detecting doublets from scRNA-seq data. We developed an R package DoubletCollection to integrate the
According to the National Cancer Institute, there were 9.5 million cancer-related deaths in 2018. A challenge in improving treatment is resistance in genetically unstable cells. The purpose of this study is to evaluate unsupervised machine learning o
The collection of immunoglobulin genes in an individuals germline, which gives rise to B cell receptors via recombination, is known to vary significantly across individuals. In humans, for example, each individual has only a fraction of the several h
We develop a theoretical approach that uses physiochemical kinetics modelling to describe cell population dynamics upon progression of viral infection in cell culture, which results in cell apoptosis (programmed cell death) and necrosis (direct cell