أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Xiaoyun Zhang

CaT: Weakly Supervised Object Detection with Category Transfer

142 - Tianyue Cao , Lianyu Du , Xiaoyun Zhang 2021

A large gap exists between fully-supervised object detection and weakly-supervised object detection. To narrow this gap, some methods consider knowledge transfer from additional fully-supervised dataset. But these methods do not fully exploit discrim inative category information in the fully-supervised dataset, thus causing low mAP. To solve this issue, we propose a novel category transfer framework for weakly supervised object detection. The intuition is to fully leverage both visually-discriminative and semantically-correlated category information in the fully-supervised dataset to enhance the object-classification ability of a weakly-supervised detector. To handle overlapping category transfer, we propose a double-supervision mean teacher to gather common category information and bridge the domain gap between two datasets. To handle non-overlapping category transfer, we propose a semantic graph convolutional network to promote the aggregation of semantic features between correlated categories. Experiments are conducted with Pascal VOC 2007 as the target weakly-supervised dataset and COCO as the source fully-supervised dataset. Our category transfer framework achieves 63.5% mAP and 80.3% CorLoc with 5 overlapping categories between two datasets, which outperforms the state-of-the-art methods. Codes are avaliable at https://github.com/MediaBrain-SJTU/CaT.

الرؤية الحاسوبية وتمييز الأنماط

Content Adaptive and Error Propagation Aware Deep Video Compression

187 - Guo Lu , Chunlei Cai , Xiaoyun Zhang 2020

Recently, learning based video compression methods attract increasing attention. However, the previous works suffer from error propagation due to the accumulation of reconstructed error in inter predictive coding. Meanwhile, the previous learning bas ed video codecs are also not adaptive to different video contents. To address these two problems, we propose a content adaptive and error propagation aware video compression system. Specifically, our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame. Based on the learned long-term temporal information, our approach effectively alleviates error propagation in reconstructed frames. More importantly, instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system. The proposed approach updates the parameters for encoder according to the rate-distortion criterion but keeps the decoder unchanged in the inference stage. Therefore, the encoder is adaptive to different video contents and achieves better compression performance by reducing the domain gap between the training and testing datasets. Our method is simple yet effective and outperforms the state-of-the-art learning based video codecs on benchmark datasets without increasing the model size or decreasing the decoding speed.

معالجة الصور والفيديو الرؤية الحاسوبية وتمييز الأنماط

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد