Research papers, master's theses, and doctoral dissertations published by Rui Zeng

Predictive Optimal Control with Data-Based Disturbance Scenario Tree Approximation

120 - Ran Jing , Xiangrui Zeng 2021

Efficiently computing the optimal control policy for a complicated future with stochastic disturbance has always been a challenge. The predicted stochastic future disturbance can be represented by a scenario tree, but solving the optimal control problem with a scenario tree is usually computationally demanding. In this paper, we propose a data-based clustering approximation method for the scenario tree representation. Unlike the popular Markov chain approximation, the proposed method can retain information from previous steps while keeping the state space size small. The predictive optimal control problem can then be approximately solved with reduced computational load using dynamic programming. The proposed method is evaluated in numerical examples and compared with a method that treats the disturbance as a non-stationary Markov chain. The results show that the proposed method can achieve better control performance than the Markov chain method.

Systems and Control

Feedback-Based Dynamic Feature Selection for Constrained Continuous Data Acquisition

130 - Alp Sahin , Xiangrui Zeng 2020

Relevant and high-quality data are critical to the successful development of machine learning applications. For machine learning applications on dynamic systems equipped with a large number of sensors, such as connected vehicles and robots, finding relevant and high-quality data features efficiently is a challenging problem. In this work, we address the problem of feature selection in constrained continuous data acquisition. We propose a feedback-based dynamic feature selection algorithm that efficiently decides on the feature set for data collection from a dynamic system in a step-wise manner. We formulate the sequential feature selection procedure as a Markov Decision Process. The machine learning model performance feedback with an exploration component is used as the reward function in an $\epsilon$-greedy action selection. Our evaluation shows that the proposed feedback-based feature selection algorithm outperforms constrained baseline methods and matches the performance of unconstrained baseline methods.
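The $\epsilon$-greedy action selection mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the state, action, or reward definitions, so everything here is an assumption: `value_estimates` is a hypothetical list of estimated rewards (one per candidate action, for example one per candidate feature set), and `epsilon_greedy` is an illustrative name, not the authors' code.

```python
import random


def epsilon_greedy(value_estimates, epsilon, rng=random):
    """Return an action index chosen by the epsilon-greedy rule.

    value_estimates: hypothetical per-action reward estimates (assumption).
    epsilon: probability of exploring, i.e. picking a uniformly random action.
    rng: any object with random() and randrange(), e.g. random.Random(seed).
    """
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(value_estimates))
    # Exploit: choose the action with the highest estimate
    # (ties resolved in favour of the lowest index).
    return value_estimates.index(max(value_estimates))


if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [0.10, 0.90, 0.30]  # made-up numbers for illustration only
    picks = [epsilon_greedy(estimates, 0.1, rng) for _ in range(1000)]
    print("share of greedy picks:", picks.count(1) / len(picks))
```

With `epsilon` set to 0 the rule is purely greedy, and with `epsilon` set to 1 it is purely random; values in between trade exploration against exploitation.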

Machine Learning, Systems and Control

Online Alternate Generator against Adversarial Attacks

69 - Haofeng Li , Yirui Zeng , Guanbin Li 2020

The field of computer vision has witnessed phenomenal progress in recent years, partially due to the development of deep convolutional neural networks. However, deep learning models are notoriously sensitive to adversarial examples, which are synthesized by adding quasi-perceptible noise to real images. Some existing defense methods require retraining the attacked target networks and augmenting the training set via known adversarial attacks, which is inefficient and might be unpromising against unknown attack types. To overcome these issues, we propose a portable defense method, the online alternate generator, which does not need to access or modify the parameters of the target networks. The proposed method works by synthesizing another image from scratch online for an input image, instead of removing or destroying adversarial noise. To avoid pretrained parameters being exploited by attackers, we alternately update the generator and the synthesized image at the inference stage. Experimental results demonstrate that the proposed defense method outperforms a series of state-of-the-art defending models against gray-box adversarial attacks.

Computer Vision and Pattern Recognition

Few shot domain adaptation for in situ macromolecule structural classification in cryo-electron tomograms

184 - Liangyong Yu , Ran Li , Xiangrui Zeng 2020

Motivation: Cryo-Electron Tomography (cryo-ET) visualizes the structure and spatial organization of macromolecules and their interactions with other subcellular components inside single cells in the close-to-native state at sub-molecular resolution. Such information is critical for the accurate understanding of cellular processes. However, subtomogram classification remains one of the major challenges for the systematic recognition and recovery of macromolecule structures in cryo-ET because of imaging limits and data quantity. Recently, deep learning has significantly improved the throughput and accuracy of large-scale subtomogram classification. However, it is often difficult to get enough high-quality annotated subtomogram data for supervised training due to the enormous expense of labeling. To tackle this problem, it is beneficial to use another, already annotated dataset to assist the training process. However, due to the discrepancy in image intensity distribution between the source domain and the target domain, a model trained on subtomograms in the source domain may perform poorly in predicting subtomogram classes in the target domain. Results: In this paper, we adapt a few shot domain adaptation method for deep learning based cross-domain subtomogram classification. The essential idea of our method consists of two parts: 1) take full advantage of the distribution of plentiful unlabeled target domain data, and 2) exploit the correlation between the whole source domain dataset and the few labeled target domain data. Experiments conducted on simulated and real datasets show that our method achieves significant improvement on cross-domain subtomogram classification compared with baseline methods.

Quantitative Methods, Computer Vision and Pattern Recognition, Machine Learning

MTRNet++: One-stage Mask-based Scene Text Eraser

123 - Osman Tursun , Simon Denman , Rui Zeng 2019

A precise, controllable, interpretable and easily trainable text removal approach is necessary for both user-specific and large-scale text removal applications. To achieve this, we propose a one-stage mask-based text inpainting network, MTRNet++. It has a novel architecture that includes mask-refine, coarse-inpainting and fine-inpainting branches, and attention blocks. With this architecture, MTRNet++ can remove text either with or without an external mask. It achieves state-of-the-art results on both the Oxford and SCUT datasets without using external ground-truth masks. The results of ablation studies demonstrate that the proposed multi-branch architecture with attention blocks is effective and essential. It also demonstrates controllability and interpretability.

Computer Vision and Pattern Recognition

AITom: Open-source AI platform for cryo-electron tomography data analysis

343 - Xiangrui Zeng , Min Xu 2019

Cryo-electron tomography (cryo-ET) is an emerging technology for the 3D visualization of structural organizations and interactions of subcellular components at near-native state and sub-molecular resolution. Tomograms captured by cryo-ET contain heterogeneous structures representing the complex and dynamic subcellular environment. Since the structures are not purified or fluorescently labeled, the spatial organization and interaction between both known and unknown structures can be studied in their native environment. The rapid advances of cryo-ET have generated abundant 3D cellular imaging data. However, the systematic localization, identification, segmentation, and structural recovery of the subcellular components require efficient and accurate large-scale image analysis methods. We introduce AITom, an open-source artificial intelligence platform for cryo-ET researchers. AITom provides many public as well as in-house algorithms for performing cryo-ET data analysis through both traditional template-based or template-free approaches and deep learning approaches. AITom also supports remote interactive analysis. Comprehensive tutorials for each analysis module are provided to guide the user. We welcome researchers and developers to join this collaborative open-source software development project. Availability: https://github.com/xulabs/aitom

Quantitative Methods, Machine Learning, Image and Video Processing

Adversarial Pulmonary Pathology Translation for Pairwise Chest X-ray Data Augmentation

97 - Yunyan Xing , Zongyuan Ge , Rui Zeng 2019

Recent works show that Generative Adversarial Networks (GANs) can be successfully applied to chest X-ray data augmentation for lung disease recognition. However, the implausible and distorted pathology features generated by a less-than-perfect generator may lead to wrong clinical decisions. Why not keep the original pathology region? We propose a novel approach that allows our generative model to generate high-quality, plausible images that contain undistorted pathology areas. The main idea is to design a training scheme based on an image-to-image translation network to introduce variations of new lung features around the pathology ground-truth area. Moreover, our model is able to leverage both annotated disease images and unannotated healthy lung images for the purpose of generation. We demonstrate the effectiveness of our model on two tasks: (i) we invite certified radiologists to assess the quality of the generated synthetic images against real images and those from other state-of-the-art generative models, and (ii) data augmentation to improve the performance of disease localisation.

Image and Video Processing, Computer Vision and Pattern Recognition

CS Sparse K-means: An Algorithm for Cluster-Specific Feature Selection in High-Dimensional Clustering

71 - Xiangrui Zeng , Hongyu Zheng 2019

Feature selection is an important and challenging task in high-dimensional clustering. For example, in genomics, there may only be a small number of genes that are differentially expressed, which are informative to the overall clustering structure. Existing feature selection methods, such as Sparse K-means, rarely tackle the problem of accounting for features that can only separate a subset of clusters. In genomics, it is highly likely that a gene can only define one subtype against all the other subtypes, or distinguish a pair of subtypes but not others. In this paper, we propose a K-means based clustering algorithm that discovers informative features as well as which cluster pairs are separable by each selected feature. The method is essentially an EM algorithm, in which we introduce lasso-type constraints on each cluster pair in the M step, and make the E step possible by maximizing the raw cross-cluster distance instead of minimizing the intra-cluster distance. The results are demonstrated on simulated data and a leukemia gene expression dataset.

Methodology, Machine Learning

Simultaneous Estimation of Number of Clusters and Feature Sparsity in Clustering High-Dimensional Data

205 - Yujia Li , Xiangrui Zeng , Chien-Wei Lin 2019

Estimating the number of clusters (K) is a critical and often difficult task in cluster analysis. Many methods have been proposed to estimate K, including some top performers that use a resampling approach. When performing cluster analysis on high-dimensional data, simultaneous clustering and feature selection is needed for improved interpretation and performance. To our knowledge, no study has investigated simultaneous estimation of K and feature selection in an exploratory cluster analysis. In this paper, we propose a resampling method to fill this gap and evaluate its performance under the sparse K-means clustering framework. The proposed target function balances sensitivity and specificity of clustering evaluation of pairwise subjects from clustering of full and subsampled data. In extensive simulations, the method performs among the best of the classical methods in estimating K in low-dimensional data. For high-dimensional simulation data, it also shows superior performance in simultaneously estimating K and the feature sparsity parameter. Finally, we evaluated the methods on four microarray, two RNA-seq, one SNP and two non-omics datasets. The proposed method achieves better clustering accuracy with fewer selected predictive genes in almost all real applications.

Methodology

Deep Learning-Based Strategy for Macromolecules Classification with Imbalanced Data from Cellular Electron Cryotomography

76 - Ziqian Luo , Xiangrui Zeng , Zhipeng Bao 2019

A deep learning model trained on imbalanced data may not work satisfactorily, since it can be dominated by the major classes and thus may ignore the classes with a small amount of data. In this paper, we apply deep learning based imbalanced data classification for the first time to cellular macromolecular complexes captured by cryo-electron tomography (cryo-ET). We adopt a range of strategies to cope with imbalanced data, including data sampling, bagging, boosting, and a Genetic Programming based method. In particular, inspired by the Inception 3D network, we propose a multi-path CNN model combining focal loss and mixup on the cryo-ET dataset to expand the dataset, where each path has its best performance on a corresponding type of data and the network learns the combinations of the paths to improve classification performance. In addition, extensive experiments show that our proposed method is flexible enough to cope with different numbers of classes by adjusting the number of paths in our multi-path model. To our knowledge, this work is the first application of deep learning methods for imbalanced data to the internal tissue classification of cell macromolecular complexes, which opens up a new path for cell classification in the field of computational biology.
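The abstract above names focal loss and mixup without defining them. The sketch below shows the standard textbook forms of both for a single sample, as a rough illustration only: how the paper combines them inside its multi-path CNN is not stated here, and the function names and default parameters (`gamma=2.0`, `alpha=0.2`) are assumptions chosen for the example.

```python
import math
import random


def focal_loss(p_true, gamma=2.0):
    """Focal loss for one sample: -(1 - p)^gamma * log(p).

    p_true is the predicted probability of the sample's true class (0 < p <= 1).
    gamma = 0 reduces this to ordinary cross-entropy; larger gamma
    down-weights samples the model already classifies confidently.
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def sample_lambda(alpha=0.2, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha), as is usual for mixup."""
    return rng.betavariate(alpha, alpha)


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their one-hot labels with weight lam in [0, 1]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


if __name__ == "__main__":
    # A confidently correct prediction contributes far less under focal loss.
    print(focal_loss(0.9, gamma=0.0), focal_loss(0.9, gamma=2.0))
    lam = sample_lambda(rng=random.Random(0))
    print(mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam))
```

Both techniques target the imbalance problem from different sides: focal loss reweights the training objective toward hard, often minority-class, samples, while mixup synthesizes additional training points between existing ones.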

Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
