Visual sentiment analysis has received increasing attention in recent years. However, dataset quality is a concern because the sentiment labels are crowd-sourced, subjective, and prone to mistakes. This poses a severe threat to data-driven models, including deep neural networks, which generalize poorly on test cases if they are trained to overfit samples with noisy sentiment labels. Inspired by recent progress on learning with noisy labels, we propose a robust learning method for visual sentiment analysis. Our method relies on an external memory to aggregate and filter noisy labels during training, and thus prevents the model from overfitting the noisy cases. The memory is composed of prototypes with corresponding labels, both of which can be updated online. We establish a benchmark for visual sentiment analysis with label noise using publicly available datasets. Experimental results under the proposed benchmark settings comprehensively show the effectiveness of our method.
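To make the prototype-memory idea concrete, here is a minimal sketch of one plausible realization: a running per-class prototype bank that down-weights samples whose features disagree with the prototype of their given label. The class, its name, and the weighting rule are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn.functional as F

class PrototypeMemory:
    """Hypothetical sketch of an external prototype memory for noisy-label
    filtering: one running prototype per sentiment class, updated online."""

    def __init__(self, num_classes: int, feat_dim: int, momentum: float = 0.9):
        self.prototypes = torch.zeros(num_classes, feat_dim)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor) -> None:
        # Online exponential-moving-average update of each class prototype.
        for c in labels.unique():
            batch_proto = feats[labels == c].mean(dim=0)
            self.prototypes[c] = (self.momentum * self.prototypes[c]
                                  + (1 - self.momentum) * batch_proto)

    def sample_weights(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Weight each sample by cosine agreement with its label's prototype,
        # so likely-mislabelled samples contribute less to the training loss.
        sims = F.cosine_similarity(feats, self.prototypes[labels], dim=1)
        return sims.clamp(min=0.0)
```

In such a setup the per-sample cross-entropy would be multiplied by `sample_weights` before averaging, which is one simple way an external memory can "filter" noisy labels during training.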
293 - Wei Zhu, Andrew White, Jiebo Luo 2021
Chemistry research incurs high material and computational costs to conduct experiments. Institutions thus consider chemical data to be valuable, and there have been few efforts to construct large public datasets for machine learning. Another challenge is that different institutions are interested in different classes of molecules, creating heterogeneous data that cannot be easily joined by conventional distributed training. In this work, we introduce federated heterogeneous molecular learning to address these challenges. Federated learning allows end-users to build a global model collaboratively while keeping the training data distributed over isolated clients. Due to the lack of related research, we first simulate a federated heterogeneous benchmark called FedChem. FedChem is constructed by jointly performing scaffold splitting and Latent Dirichlet Allocation on existing datasets. Our results on FedChem show that significant learning challenges arise when working with heterogeneous molecules. We then propose a method to alleviate the problem, namely Federated Learning by Instance reweighTing (FLIT). FLIT can align the local training across heterogeneous clients by improving the performance for uncertain samples. Comprehensive experiments conducted on our new benchmark FedChem validate the advantages of this method over other federated learning schemes. FedChem should enable a new type of collaboration for improving AI in chemistry that mitigates concerns about valuable chemical data.
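A hedged sketch, in the spirit of instance reweighting for uncertain samples: up-weight each local example when the shared global model is uncertain about it (here measured by predictive entropy). The uncertainty measure and weighting scheme are illustrative assumptions, not FLIT's published objective.

```python
import torch
import torch.nn.functional as F

def reweighted_local_loss(logits: torch.Tensor,
                          targets: torch.Tensor,
                          global_logits: torch.Tensor,
                          gamma: float = 1.0) -> torch.Tensor:
    """Local training loss with instance reweighting: samples the global
    model is uncertain about (high entropy) get larger weights, pushing
    local clients to focus on examples that are hard across the federation."""
    probs = F.softmax(global_logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    weights = (gamma * entropy).exp()        # higher uncertainty -> larger weight
    weights = weights / weights.mean()       # keep the overall loss scale stable
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (weights.detach() * per_sample).mean()
```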
118 - Haoxi Ran, Wei Zhuo, Jun Liu 2021
The prevalence of relation networks in computer vision stands in stark contrast to underexplored point-based methods. In this paper, we explore the possibilities of local relation operators and survey their feasibility. We propose a scalable and efficient module, called the group relation aggregator. The module computes the feature of a group based on the aggregation of the features of the inner-group points, weighted by geometric relations and semantic relations. We adopt this module to design our RPNet. We further verify the expandability of RPNet, in terms of both depth and width, on the tasks of classification and segmentation. Surprisingly, empirical results show that a wider RPNet fits classification better, while a deeper RPNet works better on segmentation. RPNet achieves state-of-the-art results for classification and segmentation on challenging benchmarks. We also compare our local aggregator with PointNet++, achieving around 30% parameter and 50% computation savings. Finally, we conduct experiments to reveal the robustness of RPNet with regard to rigid transformations and noise.
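The following sketch illustrates one plausible form of such a group relation aggregator: relative coordinates stand in for the geometric relation, feature offsets for the semantic relation, and a small MLP turns them into per-neighbor aggregation weights. Shapes, the choice of relations, and the softmax normalization are assumptions for illustration.

```python
import torch

def group_relation_aggregate(center_xyz, group_xyz, group_feats, mlp):
    """Illustrative group relation aggregation.
    Shapes: center_xyz (B, N, 3), group_xyz (B, N, K, 3),
            group_feats (B, N, K, C); `mlp` maps 3 + C -> C channels."""
    rel_xyz = group_xyz - center_xyz.unsqueeze(2)                    # geometric relation
    rel_feat = group_feats - group_feats.mean(dim=2, keepdim=True)   # semantic relation
    relation = torch.cat([rel_xyz, rel_feat], dim=-1)                # (B, N, K, 3 + C)
    weights = torch.softmax(mlp(relation), dim=2)                    # per-neighbor weights
    return (weights * group_feats).sum(dim=2)                        # group feature (B, N, C)
```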
Deep learning algorithms mine knowledge from the training data and are thus likely to inherit the dataset's bias. As a result, the obtained model would generalize poorly and even mislead the decision process in real-life applications. We propose to remove the bias information misused by the target task with a cross-sample adversarial debiasing (CSAD) method. CSAD explicitly extracts target and bias features disentangled from the latent representation generated by a feature extractor, and then learns to discover and remove the correlation between the target and bias features. The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator. Moreover, we propose joint content and local structural representation learning to boost mutual information estimation for better performance. We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
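As a stand-in for the paper's cross-sample estimator, here is a minimal MINE-style neural mutual information lower bound: joint (target, bias) pairs are scored against pairs shuffled across samples. The architecture and dimensions are assumptions; only the Donsker-Varadhan form of the bound is standard.

```python
import torch
import torch.nn as nn

class MINEEstimator(nn.Module):
    """Sketch of a neural MI estimator (Donsker-Varadhan lower bound) for
    measuring the correlation between target and bias features."""

    def __init__(self, target_dim: int, bias_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(target_dim + bias_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, target_feat: torch.Tensor, bias_feat: torch.Tensor) -> torch.Tensor:
        # Score true (target, bias) pairs ...
        joint = self.net(torch.cat([target_feat, bias_feat], dim=1)).mean()
        # ... against cross-sample (shuffled) pairs from the product of marginals.
        shuffled = bias_feat[torch.randperm(bias_feat.size(0))]
        marginal = self.net(torch.cat([target_feat, shuffled], dim=1)).exp().mean().log()
        return joint - marginal  # lower bound on I(target; bias)
```

In an adversarial debiasing setup of this kind, the estimator is trained to maximize this bound while the feature extractor is trained to minimize it, driving the target-bias correlation toward zero.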
72 - Hai Lin, Yuwei Zhu 2021
We use entangled multimode coherent states to produce entangled giant graviton states, in the context of gauge/gravity duality. We make a smeared distribution of the entangled multimode coherent states on the circle, or on the five-sphere in the higher-dimensional view. In gauge/gravity duality, we analyze the superposition of giant graviton states and the entangled pairs of giant graviton states. We map a class of angular distribution functions to unitary operations on the pairs. We also use Young tableau states to construct cat states and qudit states. Various bipartite quantum states involving Young tableau states are analyzed, including micro-macro entangled states. Mixed states of Young tableau states are generated by ensemble mixing with angular distribution functions, and also by passing through noisy quantum channels. We then produce mixed entangled pairs of giant graviton states by including interaction with the environment and using noisy quantum channels.
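For orientation, the generic building block referred to here is the two-mode entangled coherent (cat-like) state; the form below is the standard textbook expression, included only as an assumed illustration, since the paper dresses such states with Young tableau data in the giant-graviton construction.

```latex
% Standard two-mode entangled coherent state and its normalization,
% using the overlap <alpha|-alpha> = exp(-2|alpha|^2).
\begin{equation}
  |\Psi_{\pm}\rangle \;=\; \mathcal{N}_{\pm}
  \bigl( |\alpha\rangle_{1}\,|\beta\rangle_{2}
       \pm |{-\alpha}\rangle_{1}\,|{-\beta}\rangle_{2} \bigr),
  \qquad
  \mathcal{N}_{\pm}^{-2} \;=\; 2 \pm 2\, e^{-2|\alpha|^{2} - 2|\beta|^{2}} .
\end{equation}
```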
As we approach the era of quantum advantage, when quantum computers (QCs) can outperform any classical computer on particular tasks, there remains the difficult challenge of how to validate their performance. While algorithmic success can be easily verified in some instances, such as number factoring or oracular algorithms, these approaches only provide pass/fail information for a single QC. On the other hand, a comparison between different QCs on the same arbitrary circuit provides a lower bound for generic validation: a quantum computation is only as valid as the agreement between the results produced on different QCs. Such an approach is also at the heart of evaluating metrological standards such as disparate atomic clocks. In this paper, we report a cross-platform QC comparison using randomized and correlated measurements that results in a wealth of information on the QC systems. We execute several quantum circuits on widely different physical QC platforms and analyze the cross-platform fidelities.
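A minimal sketch of the randomized-measurement overlap estimator commonly used for such cross-platform comparisons (in the style of Elben et al.): both devices measure the same state under the same random local unitaries, and bitstring correlations weighted by Hamming distance yield state overlaps. This ignores finite-sample bias corrections, and the data layout is an assumption.

```python
import numpy as np

def cross_platform_overlap(p1: dict, p2: dict, n_qubits: int) -> float:
    """Overlap estimate from bitstring probabilities p1, p2 (dicts mapping
    bitstring -> estimated probability) measured on two devices under the
    SAME random local unitaries."""
    z = 0.0
    for s, ps in p1.items():
        for t, pt in p2.items():
            hamming = sum(a != b for a, b in zip(s, t))
            z += (-2.0) ** (-hamming) * ps * pt
    return 2 ** n_qubits * z

def cross_platform_fidelity(runs) -> float:
    """Average the overlaps over many random unitary settings, then normalize
    the cross term by the larger device purity. `runs` is a list of
    (p1, p2, n_qubits) tuples, one per random-unitary setting."""
    z11 = np.mean([cross_platform_overlap(p1, p1, n) for p1, _, n in runs])
    z22 = np.mean([cross_platform_overlap(p2, p2, n) for _, p2, n in runs])
    z12 = np.mean([cross_platform_overlap(p1, p2, n) for p1, p2, n in runs])
    return z12 / max(z11, z22)
```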
Wikipedia abstract generation aims to distill a Wikipedia abstract from web sources and has met significant success by adopting multi-document summarization techniques. However, previous works generally view the abstract as plain text, ignoring the fact that it is a description of a certain entity and can be decomposed into different topics. In this paper, we propose a two-stage model, TWAG, that guides abstract generation with topical information. First, we detect the topic of each input paragraph with a classifier trained on existing Wikipedia articles to divide input documents into different topics. Then, we predict the topic distribution of each abstract sentence and decode the sentence from topic-aware representations with a Pointer-Generator network. We evaluate our model on the WikiCatSum dataset, and the results show that TWAG outperforms various existing baselines and is capable of generating comprehensive abstracts. Our code and dataset can be accessed at https://github.com/THU-KEG/TWAG
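The two-stage control flow can be sketched schematically as below; `topic_classifier` and `topic_decoder` are illustrative placeholders, not the authors' API, and the real second stage is a trained Pointer-Generator network rather than a black-box function.

```python
from collections import defaultdict

def twag_style_generate(paragraphs, topic_classifier, topic_decoder):
    """Schematic two-stage, topic-guided abstract generation.

    Stage 1: route each input paragraph to a topic with a classifier
             trained on existing Wikipedia articles.
    Stage 2: decode each abstract sentence from the topic-grouped inputs.
    """
    grouped = defaultdict(list)
    for para in paragraphs:
        grouped[topic_classifier(para)].append(para)   # stage 1: topic routing

    abstract_sentences = []
    for topic, topic_paras in sorted(grouped.items()):
        # stage 2: topic-aware decoding conditioned on that topic's inputs
        abstract_sentences.append(topic_decoder(topic, topic_paras))
    return " ".join(abstract_sentences)
```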
In this work, we propose a novel, straightforward method for medical volume and sequence segmentation with limited annotations. To avert laborious annotating, the recent success of self-supervised learning (SSL) motivates pre-training on unlabeled data. Despite this success, it remains challenging to adapt typical SSL methods to volume/sequence segmentation, due to their lack of mining of local semantic discrimination and their rare exploitation of volume and sequence structures. Based on the continuity between slices/frames and the common spatial layout of organs across volumes/sequences, we introduce a novel bootstrap self-supervised representation learning method that leverages the predictability of neighboring slices. At the core of our method are a simple and straightforward dense self-supervision on the predictions of local representations, and a strategy of predicting locals based on global context, which enable stable and reliable supervision for both global and local representation mining among volumes. Specifically, we first propose an asymmetric network with an attention-guided predictor to enforce distance-specific prediction and supervision on slices within and across volumes/sequences. Second, we introduce a novel prototype-based foreground-background calibration module to enhance representation consistency. The two parts are trained jointly on labeled and unlabeled data. When evaluated on three benchmark datasets of medical volumes and sequences, our model outperforms existing methods by a large margin: 4.5% DSC on ACDC, 1.7% on Prostate, and 2.3% on CAMUS. Intensive evaluations reveal the effectiveness and superiority of our method.
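A minimal sketch of the bootstrap, asymmetric prediction idea between neighboring slices, written in a BYOL-like form (online encoder plus predictor versus a momentum target encoder). The loss form and the two-branch setup are assumptions that capture the spirit, not the paper's exact architecture.

```python
import torch
import torch.nn.functional as F

def neighbor_slice_prediction_loss(encoder, predictor, target_encoder,
                                   slice_a, slice_b):
    """The online branch (encoder + predictor) must predict the target
    branch's representation of an adjacent slice, exploiting inter-slice
    continuity as the self-supervisory signal."""
    online = predictor(encoder(slice_a))       # prediction from slice a
    with torch.no_grad():
        target = target_encoder(slice_b)       # momentum-encoded neighbor
    online = F.normalize(online, dim=1)
    target = F.normalize(target, dim=1)
    return (2 - 2 * (online * target).sum(dim=1)).mean()  # cosine-based loss
```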
3D LiDAR (light detection and ranging) semantic segmentation is important for scene understanding in many applications, such as autonomous driving and robotics. For example, for autonomous cars equipped with RGB cameras and LiDAR, it is crucial to fuse complementary information from the different sensors for robust and accurate segmentation. Existing fusion-based methods, however, may not achieve promising performance due to the vast difference between the two modalities. In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to exploit perceptual information from two modalities, namely, appearance information from RGB images and spatio-depth information from point clouds. To this end, we first project point clouds to the camera coordinates to provide spatio-depth information for RGB images. Then, we propose a two-stream network to extract features from the two modalities separately, and fuse the features by effective residual-based fusion modules. Moreover, we propose additional perception-aware losses to measure the perceptual difference between the two modalities. Extensive experiments on two benchmark data sets show the superiority of our method. For example, on nuScenes, our PMF outperforms the state-of-the-art method by 0.8 in mIoU.
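The point-to-camera projection step is standard geometry and can be sketched as below: transform LiDAR points into the camera frame with the extrinsic matrix, project with the intrinsics, and rasterize a sparse depth map aligned with the RGB image. Variable names are illustrative, and calibration matrices are assumed given.

```python
import numpy as np

def project_points_to_image(points_xyz, T_cam_lidar, K, h, w):
    """Project LiDAR points into the camera image to build a sparse
    spatio-depth channel. T_cam_lidar (4x4) is the extrinsic transform,
    K (3x3) the camera intrinsics, (h, w) the image size."""
    pts_h = np.concatenate([points_xyz, np.ones((len(points_xyz), 1))], axis=1)
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]     # points in the camera frame
    cam = cam[cam[:, 2] > 0]                   # keep points in front of the camera
    uv = (K @ cam.T).T
    u = (uv[:, 0] / uv[:, 2]).astype(int)      # pixel column
    v = (uv[:, 1] / uv[:, 2]).astype(int)      # pixel row
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth = np.zeros((h, w), dtype=np.float32)
    depth[v[valid], u[valid]] = cam[valid, 2]  # sparse depth aligned with RGB
    return depth
```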
191 - Yuhan Wang, Xu Chen, Junwei Zhu 2021
In this work, we propose a high-fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results. Unlike other existing face swapping works that only use a face recognition model to keep identity similarity, we propose 3D shape-aware identity to control the face shape with geometric supervision from a 3DMM and a 3D face reconstruction method. Meanwhile, we introduce the Semantic Facial Fusion module to optimize the combination of encoder and decoder features and perform adaptive blending, which makes the results more photo-realistic. Extensive experiments on faces in the wild demonstrate that our method can preserve identity better, especially the face shape, and can generate more photo-realistic results than previous state-of-the-art methods.
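One plausible form of the geometric shape supervision is sketched below: a pretrained 3D face reconstruction network regresses 3DMM shape coefficients, and the swapped result is penalized for deviating from the source's shape coefficients. The loss form and the `face_recon` interface are assumptions for illustration, not HifiFace's exact objective.

```python
import torch

def shape_aware_identity_loss(result_img, source_img, face_recon,
                              w_shape: float = 1.0):
    """Penalize the generated face for deviating from the SOURCE face's
    3DMM shape coefficients. `face_recon` is an assumed pretrained 3D face
    reconstruction network returning shape coefficients per image."""
    with torch.no_grad():
        src_shape = face_recon(source_img)      # source shape coefficients
    out_shape = face_recon(result_img)          # shape of the generated face
    return w_shape * torch.mean(torch.abs(out_shape - src_shape))
```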