Research papers and master's and doctoral theses published by Yue Zhang

Automatic hippocampal surface generation via 3D U-net and active shape modeling with hybrid particle swarm optimization

117 - Pinyuan Zhong, Yue Zhang, Xiaoying Tang 2021

In this paper, we proposed and validated a fully automatic pipeline for hippocampal surface generation via 3D U-net coupled with active shape modeling (ASM). The proposed pipeline consisted of three steps. First, for each magnetic resonance image, a 3D U-net was employed to obtain the automatic hippocampus segmentation in each hemisphere. Second, ASM was performed on a group of pre-obtained template surfaces to generate the mean shape and shape variation parameters through principal component analysis. Finally, hybrid particle swarm optimization was utilized to search for the optimal shape variation parameters that best match the segmentation. The hippocampal surface was then generated from the mean shape and the shape variation parameters. The proposed pipeline was observed to provide hippocampal surfaces in both hemispheres with high accuracy, correct anatomical topology, and sufficient smoothness.

Neural and Evolutionary Computing, Computer Vision and Pattern Recognition, Machine Learning
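
The shape-modeling step described in the abstract above (a mean shape plus shape variation modes obtained by principal component analysis) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the template surfaces are already aligned and share vertex correspondence, uses random data in place of real templates, and leaves out the hybrid particle swarm search over the parameters `b`. All function names are hypothetical.

```python
import numpy as np

def fit_shape_model(shapes, n_modes):
    """Fit a PCA shape model.

    shapes: (n_templates, n_vertices, 3) array of corresponding,
    pre-aligned surfaces (an assumption of this sketch).
    """
    n = shapes.shape[0]
    X = shapes.reshape(n, -1)          # one flattened surface per row
    mean = X.mean(axis=0)
    # SVD of the centered data yields the principal modes directly,
    # without forming the (large) covariance matrix.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    modes = vt[:n_modes]               # (n_modes, 3 * n_vertices)
    variances = s[:n_modes] ** 2 / (n - 1)
    return mean, modes, variances

def synthesize(mean, modes, b):
    """Generate a surface: mean shape plus a weighted sum of modes."""
    return (mean + b @ modes).reshape(-1, 3)

def project(shape, mean, modes):
    """Shape variation parameters of a given surface."""
    return modes @ (shape.reshape(-1) - mean)

# Random stand-in for template surfaces (illustrative data only).
rng = np.random.default_rng(0)
templates = rng.normal(size=(10, 50, 3))
mean, modes, variances = fit_shape_model(templates, n_modes=4)
b = project(templates[0], mean, modes)
recon = synthesize(mean, modes, b)
```

In a pipeline like the one described, an optimizer would search over `b` so that `synthesize(mean, modes, b)` best matches the segmentation; here `project` only shows how parameters relate to a known surface.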

Smelting Gold and Silver for Improved Multilingual AMR-to-Text Generation

96 - Leonardo F. R. Ribeiro, Jonas Pfeiffer, Yue Zhang 2021

Recent work on multilingual AMR-to-text generation has exclusively focused on data augmentation strategies that utilize silver AMR. However, this assumes a high quality of generated AMRs, potentially limiting the transferability to the target task. In this paper, we investigate different techniques for automatically generating AMR annotations, where we aim to study which source of information yields better multilingual results. Our models trained on gold AMR with silver (machine translated) sentences outperform approaches which leverage generated silver AMR. We find that combining both complementary sources of information further improves multilingual AMR-to-text generation. Our models surpass the previous state of the art for German, Italian, Spanish, and Chinese by a large margin.

Computation and Language

Unsupervised clothing change adaptive person ReID

109 - Ziyue Zhang, Shuai Jiang, Congzhentao Huang 2021

Clothing changes and the lack of data labels are both crucial challenges in person ReID. For the former challenge, people may appear multiple times at different locations wearing different clothing. However, most current person ReID research focuses on benchmarks in which a person's clothing stays the same throughout. For the latter challenge, some researchers try to make the model transfer information from a labeled source dataset to an unlabeled dataset, whereas purely unsupervised training is less used. In this paper, we aim to solve both problems at the same time. We design a novel unsupervised model, Sync-Person-Cloud ReID, to solve the unsupervised clothing change person ReID problem. We develop a purely unsupervised clothing change person ReID pipeline with a person sync augmentation operation and a same-person feature restriction. The person sync augmentation supplies additional same-person resources. These same-person resources can be used as partially supervised input by the same-person feature restriction. Extensive experiments on clothing change ReID datasets show the superior performance of our methods.

Computer Vision and Pattern Recognition

ANOMALYMAXQ: Anomaly-Structured Maximization to Query in Attributed Network

91 - Xinyue Zhang, Nannan Wu, Zixu Zhen 2021

The detection of anomalous subgraphs arises naturally in various real-life tasks, yet label noise seriously interferes with the result. As a motivation for our work, we focus on inaccurate supervision and use prior knowledge, such as query graphs, to reduce the effects of noise. Anomalies in attributed networks exhibit structured properties, e.g., anomalies in money laundering have a ring structure. The main challenge is to query anomalies in attributed networks quickly and approximately. We propose a novel search method: 1) decomposing a query graph into stars; 2) sorting attributed vertices; and 3) assembling anomaly stars under the root vertex sequence into a near query. We present ANOMALYMAXQ and evaluate it on a 68,411 company network (Tianyancha dataset), 7.72m patent networks (Company patents), and so on. Extensive experiments show that our method has high robustness and fast response time. On the patent dataset, the average running time to query the graph once is about 252 seconds.

Data Structures and Algorithms, Numerical Analysis
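
The first step of the search method in the abstract above, decomposing a query graph into stars, can be illustrated with a simple greedy scheme. This is only a sketch under assumptions: the abstract does not say how the paper picks root vertices, so the rule used here (repeatedly take the vertex with the most remaining edges, ties broken by label) is a stand-in, and `decompose_into_stars` is a hypothetical name.

```python
from collections import defaultdict

def decompose_into_stars(edges):
    """Greedily split an undirected graph into edge-disjoint stars.

    Returns a list of (root, leaves) pairs; every input edge appears
    in exactly one star. The root-selection rule is an assumption of
    this sketch, not taken from the paper.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    stars = []
    while True:
        # Vertex with the most uncovered edges; ties broken by label.
        root = max(adj, key=lambda v: (len(adj[v]), v), default=None)
        if root is None or not adj[root]:
            break
        leaves = sorted(adj[root])
        for leaf in leaves:
            adj[leaf].discard(root)   # mark edge (root, leaf) as covered
        adj[root].clear()
        stars.append((root, leaves))
    return stars

# A four-vertex ring, loosely echoing the ring-structured example.
ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
stars = decompose_into_stars(ring)
```

Each star can then be matched against the attributed network on its own, which is what makes a star-based decomposition convenient for approximate querying.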

Complementary Patch for Weakly Supervised Semantic Segmentation

106 - Fei Zhang, Chaochen Gu, Chenyue Zhang 2021

Weakly Supervised Semantic Segmentation (WSSS) based on image-level labels has been greatly advanced by exploiting the outputs of Class Activation Map (CAM) to generate the pseudo labels for semantic segmentation. However, CAM merely discovers seeds from a small number of regions, which may be insufficient to serve as pseudo masks for semantic segmentation. In this paper, we formulate the expansion of object regions in CAM as an increase in information. From the perspective of information theory, we propose a novel Complementary Patch (CP) Representation and prove that the information of the sum of the CAMs by a pair of input images with complementary hidden (patched) parts, namely CP Pair, is greater than or equal to the information of the baseline CAM. Therefore, a CAM with more information related to object seeds can be obtained by narrowing down the gap between the sum of CAMs generated by the CP Pair and the original CAM. We propose a CP Network (CPN) implemented by a triplet network and three regularization functions. To further improve the quality of the CAMs, we propose a Pixel-Region Correlation Module (PRCM) to augment the contextual information by using object-region relations between the feature maps and the CAMs. Experimental results on the PASCAL VOC 2012 datasets show that our proposed method achieves a new state-of-the-art in WSSS, validating the effectiveness of our CP Representation and CPN.

Computer Vision and Pattern Recognition
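
The CP Pair in the abstract above is a pair of input images whose hidden (patched) parts are complementary. A minimal sketch of how such a pair could be built is shown below. It assumes a random grid of square patches, which is a choice made for illustration; the abstract does not specify the masking scheme, and the CAM network and its regularization terms are not included. `complementary_patch_pair` is a hypothetical name.

```python
import numpy as np

def complementary_patch_pair(image, patch, rng):
    """Build two views of `image` with complementary hidden patches.

    image: (H, W, C) array with H and W divisible by `patch`.
    Every pixel is visible in exactly one of the two views.
    """
    h, w = image.shape[:2]
    # 1 = patch kept in the first view, 0 = patch kept in the second.
    grid = rng.integers(0, 2, size=(h // patch, w // patch))
    # Expand each grid cell to a patch-sized block, add a channel axis.
    mask = np.kron(grid, np.ones((patch, patch), dtype=grid.dtype))[..., None]
    return image * mask, image * (1 - mask), mask

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
view_a, view_b, mask = complementary_patch_pair(image, patch=4, rng=rng)
```

Because the two masks sum to one everywhere, the two views together contain exactly the original image, which is the property the complementary formulation relies on.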

Adaptive Normalized Representation Learning for Generalizable Face Anti-Spoofing

153 - Shubao Liu, Ke-Yue Zhang, Taiping Yao 2021

With various face presentation attacks arising under unseen scenarios, face anti-spoofing (FAS) based on domain generalization (DG) has drawn growing attention due to its robustness. Most existing methods utilize DG frameworks to align the features to seek a compact and generalized feature space. However, little attention has been paid to the feature extraction process for the FAS task, especially the influence of normalization, which also has a great impact on the generalization of the learned representation. To address this issue, we propose a novel perspective of face anti-spoofing that focuses on the normalization selection in the feature extraction process. Concretely, an Adaptive Normalized Representation Learning (ANRL) framework is devised, which adaptively selects feature normalization methods according to the inputs, aiming to learn domain-agnostic and discriminative representation. Moreover, to facilitate the representation learning, Dual Calibration Constraints are designed, including Inter-Domain Compatible loss and Inter-Class Separable loss, which provide a better optimization direction for generalizable representation. Extensive experiments and visualizations are presented to demonstrate the effectiveness of our method against the SOTA competitors.

Computer Vision and Pattern Recognition

Two New Stenoses Detection Methods of Coronary Angiograms

189 - Yaofang Liu, Xinyue Zhang, Wenlong Wan 2021

Coronary angiography is the gold standard for the diagnosis of coronary heart disease. At present, the methods for detecting coronary artery stenoses and evaluating their degree in coronary angiograms are either subjective or not efficient enough. Two vascular stenosis detection methods for coronary angiograms are proposed to assist the diagnosis. The first is an automatic method, which can automatically segment the entire coronary vessels and mark the stenoses. The second is an interactive method, in which the user only needs to give a start point and an end point to detect the stenoses of a certain vascular segment. We have shown that the proposed tracking methods are robust for angiograms with various vessel structures. The automatic detection method can effectively measure the diameter of the vessel and mark the stenoses in different angiograms. Further investigation shows that the results of the interactive detection method accurately reflect the true stenosis situation. The proposed automatic and interactive methods are effective in various angiograms and can complement each other in clinical practice. The first method can be used for preliminary screening and the second for further quantitative analysis. Together they have the potential to improve the level of clinical diagnosis of coronary heart disease.

Image and Video Processing, Computer Vision and Pattern Recognition

Multi-phase Liver Tumor Segmentation with Spatial Aggregation and Uncertain Region Inpainting

216 - Yue Zhang, Chengtao Peng, Liying Peng 2021

Multi-phase computed tomography (CT) images provide crucial complementary information for accurate liver tumor segmentation (LiTS). State-of-the-art multi-phase LiTS methods usually fuse cross-phase features through phase-weighted summation or channel-attention based concatenation. However, these methods ignore the spatial (pixel-wise) relationships between different phases, leading to insufficient feature integration. In addition, the performance of existing methods remains subject to the uncertainty in segmentation, which is particularly acute in tumor boundary regions. In this work, we propose a novel LiTS method to adequately aggregate multi-phase information and refine uncertain region segmentation. To this end, we introduce a spatial aggregation module (SAM), which encourages per-pixel interactions between different phases, to make full use of cross-phase information. Moreover, we devise an uncertain region inpainting module (URIM) to refine uncertain pixels using neighboring discriminative features. Experiments on an in-house multi-phase CT dataset of focal liver lesions (MPCT-FLLs) demonstrate that our method achieves promising liver tumor segmentation and outperforms state-of-the-art methods.

Image and Video Processing, Computer Vision and Pattern Recognition
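
The abstract above contrasts phase-weighted summation, which applies one weight to an entire phase, with per-pixel interaction between phases. The sketch below illustrates only that contrast. It is not the SAM architecture, which the abstract does not detail: the per-pixel scores are passed in as plain arrays, whereas a real module would presumably learn them, and both function names are hypothetical.

```python
import numpy as np

def phase_weighted_sum(features, phase_weights):
    """Fuse phases with one scalar weight per phase.

    features: (P, C, H, W) feature maps from P CT phases.
    """
    w = np.asarray(phase_weights, dtype=float)
    return np.tensordot(w, features, axes=1)        # (C, H, W)

def pixelwise_aggregate(features, scores):
    """Fuse phases with a separate weight at every pixel.

    scores: (P, H, W) unnormalized scores; a softmax over the phase
    axis turns them into per-pixel weights that sum to 1.
    """
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    attn = e / e.sum(axis=0, keepdims=True)         # (P, H, W)
    fused = (attn[:, None] * features).sum(axis=0)  # (C, H, W)
    return fused, attn

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4, 5, 5))            # 3 phases (illustrative)
fused_uniform, attn = pixelwise_aggregate(features, np.zeros((3, 5, 5)))
fused_global = phase_weighted_sum(features, [1 / 3] * 3)
```

With all scores equal, the per-pixel scheme reduces to a uniform phase-weighted sum; it differs only when the scores vary across locations, for example near tumor boundaries where one phase may be more informative than another.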

Understanding the merging behavior patterns and evolutionary mechanism at freeway on-ramps

113 - Yue Zhang, Yajie Zou, Lingtao Wu, Wanbing Han 2021

Understanding the merging behavior patterns at freeway on-ramps is important for assisting the decisions of autonomous driving. This study develops a primitive-based framework to identify the driving patterns during merging processes and reveal the evolutionary mechanism at freeway on-ramps in congested traffic flow. The Nonhomogeneous Hidden Markov Model is introduced to decompose the merging processes into primitives containing semantic information. Then, time-series K-means clustering is utilized to group these primitives, which are variable-length time series, into interpretable merging behavior patterns. Unlike traditional state segmentation methods (e.g., the Hidden Markov Model), the model proposed in this study considers the dependence of the transition probability on exogenous variables, thereby revealing the influence of covariates on the evolution of driving patterns. This approach is evaluated in the merging area at a freeway on-ramp using the INTERACTION dataset. Results demonstrate that the approach provides insight into the complicated merging processes. The findings about interpretable merging behavior patterns as well as the evolutionary mechanism can be used to design and improve merging decision-making for autonomous vehicles.

Signal Processing
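
The key difference from a standard Hidden Markov Model noted in the abstract above is that transition probabilities depend on exogenous variables. One common way to express this is a multinomial logistic (softmax) link per current state, sketched below. The abstract does not state which parameterization the study uses, so this form, the single covariate, and all names and numbers are assumptions for illustration.

```python
import math

def transition_matrix(covariates, coef, intercept):
    """Covariate-dependent transition probabilities.

    Row i gives P(next state = j | current state = i, covariates),
    using a softmax over intercept[i][j] + coef[i][j] . covariates.
    """
    n = len(intercept)
    rows = []
    for i in range(n):
        logits = [
            intercept[i][j] + sum(c * x for c, x in zip(coef[i][j], covariates))
            for j in range(n)
        ]
        m = max(logits)                       # subtract max for stability
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

# Two hidden states, one covariate (hypothetical values).
intercept = [[0.0, 0.0], [0.0, 0.0]]
coef = [[[0.0], [1.0]], [[0.0], [0.0]]]
low = transition_matrix([0.0], coef, intercept)
high = transition_matrix([2.0], coef, intercept)
```

Setting every coefficient to zero recovers a fixed transition matrix, that is, an ordinary homogeneous HMM, while a nonzero coefficient makes the chance of switching states rise or fall with the covariate.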

ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback

117 - Shiyue Zhang, Benjamin Frey, Mohit Bansal 2021

We introduce ChrEnTranslate, an online machine translation demonstration system for translation between English and Cherokee, an endangered language. It supports both statistical and neural translation models and provides quality estimation to inform users of reliability, two user feedback interfaces for experts and common users respectively, example inputs to collect human translations for monolingual data, word alignment visualization, and relevant terms from the Cherokee-English dictionary. The quantitative evaluation demonstrates that our backbone translation models achieve state-of-the-art translation performance and that our quality estimation correlates well with both BLEU and human judgment. By analyzing 216 pieces of expert feedback, we find that NMT is preferable because it copies less than SMT, and, in general, current models can translate fragments of the source sentence but make major mistakes. When we add these 216 expert-corrected parallel texts back into the training set and retrain models, equal or slightly better performance is observed, which indicates the potential of human-in-the-loop learning. Our online demo is at https://chren.cs.unc.edu/ , our code is open-sourced at https://github.com/ZhangShiyue/ChrEnTranslate , and our data is available at https://github.com/ZhangShiyue/ChrEn

Computation and Language, Artificial Intelligence
