ﻻ يوجد ملخص باللغة العربية
The recent advances in single-cell technologies have enabled us to profile genomic features at unprecedented resolution and datasets from multiple domains are available, including datasets that profile different types of genomic features and datasets that profile the same type of genomic features across different species. These datasets typically have different powers in identifying the unknown cell types through clustering, and data integration can potentially lead to a better performance of clustering algorithms. In this work, we formulate the problem in an unsupervised transfer learning framework, which utilizes knowledge learned from auxiliary dataset to improve the clustering performance of target dataset. The degree of shared information among the target and auxiliary datasets can vary, and their distributions can also be different. To address these challenges, we propose an elastic coupled co-clustering based transfer learning algorithm, by elastically propagating clustering knowledge obtained from the auxiliary dataset to the target dataset. Implementation on single-cell genomic datasets shows that our algorithm greatly improves clustering performance over the traditional learning algorithms. The source code and data sets are available at https://github.com/cuhklinlab/elasticC3.
Spectral clustering is one of the fundamental unsupervised learning methods widely used in data analysis. Sparse spectral clustering (SSC) imposes sparsity to the spectral clustering and it improves the interpretability of the model. This paper consi
Predictive modeling based on genomic data has gained popularity in biomedical research and clinical practice by allowing researchers and clinicians to identify biomarkers and tailor treatment decisions more efficiently. Analysis incorporating pathway
Comparing and aligning large datasets is a pervasive problem occurring across many different knowledge domains. We introduce and study MREC, a recursive decomposition algorithm for computing matchings between data sets. The basic idea is to partition
In this paper, we present a novel method for co-clustering, an unsupervised learning approach that aims at discovering homogeneous groups of data instances and features by grouping them simultaneously. The proposed method uses the entropy regularized
In this paper, we propose a simple algorithm to cluster nonnegative data lying in disjoint subspaces. We analyze its performance in relation to a certain measure of correlation between said subspaces. We use our clustering algorithm to develop a matr