Partial Video Domain Adaptation with Partial Adversarial Temporal Attentive Network

Published by: Yuecong Xu

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Yuecong Xu, Jianfei Yang, Haozhi Cao

Computer Vision and Pattern Recognition; Artificial Intelligence

Partial Domain Adaptation (PDA) is a practical and general domain adaptation scenario, which relaxes the fully shared label space assumption such that the source label space subsumes the target one. The key challenge of PDA is the issue of negative transfer caused by source-only classes. For videos, such negative transfer could be triggered by both spatial and temporal features, which leads to a more challenging Partial Video Domain Adaptation (PVDA) problem. In this paper, we propose a novel Partial Adversarial Temporal Attentive Network (PATAN) to address the PVDA problem by utilizing both spatial and temporal features for filtering source-only classes. Besides, PATAN constructs effective overall temporal features by attending to local temporal features that contribute more toward the class filtration process. We further introduce new benchmarks to facilitate research on PVDA problems, covering a wide range of PVDA scenarios. Empirical results demonstrate the state-of-the-art performance of our proposed PATAN across the multiple PVDA benchmarks.
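The abstract does not spell out how PATAN filters source-only classes. As a rough illustration of the general idea in partial domain adaptation, the sketch below uses a common class-weighting heuristic: average the classifier's softmax outputs over unlabeled target samples, then down-weight source classes that the target data rarely activates. This is an assumption for illustration, not PATAN's actual formulation, and the function names `estimate_class_weights` and `weighted_source_loss` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_class_weights(target_logits):
    """Average target predictions per class, scaled so the largest weight is 1.

    Classes that target samples are rarely assigned to (likely source-only
    classes) end up with weights close to 0. Assumed heuristic, not PATAN's.
    """
    num_classes = len(target_logits[0])
    mean_probs = [0.0] * num_classes
    for logits in target_logits:
        for k, p in enumerate(softmax(logits)):
            mean_probs[k] += p / len(target_logits)
    peak = max(mean_probs)
    return [p / peak for p in mean_probs]


def weighted_source_loss(source_logits, source_labels, class_weights):
    """Cross-entropy on labeled source samples, scaled by each class weight."""
    total = 0.0
    for logits, label in zip(source_logits, source_labels):
        probs = softmax(logits)
        total += -class_weights[label] * math.log(probs[label])
    return total / len(source_labels)
```

With such weights, a source sample from a class the target never uses contributes almost nothing to training, which is one way to limit negative transfer. In the video setting described above, the weights would presumably be computed from both spatial and temporal features.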

Min-Hung Chen, Zsolt Kira, Ghassan AlRegib, 2019

Although various image-based domain adaptation (DA) techniques have been proposed in recent years, domain shift in videos is still not well-explored. Most previous works only evaluate performance on small-scale datasets which are saturated. Therefore, we first propose a larger-scale dataset with larger domain discrepancy: UCF-HMDB_full. Second, we investigate different DA integration methods for videos, and show that simultaneously aligning and learning temporal dynamics achieves effective alignment even without sophisticated DA methods. Finally, we propose Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on three video DA datasets. The code and data are released at http://github.com/cmhungsteve/TA3N.

Computer Vision and Pattern Recognition; Machine Learning; Multimedia

Temporal Attentive Alignment for Large-Scale Video Domain Adaptation

Min-Hung Chen, Zsolt Kira, Ghassan AlRegib, 2019

Although various image-based domain adaptation (DA) techniques have been proposed in recent years, domain shift in videos is still not well-explored. Most previous works only evaluate performance on small-scale datasets which are saturated. Therefore, we first propose two large-scale video DA datasets with much larger domain discrepancy: UCF-HMDB_full and Kinetics-Gameplay. Second, we investigate different DA integration methods for videos, and show that simultaneously aligning and learning temporal dynamics achieves effective alignment even without sophisticated DA methods. Finally, we propose Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on four video DA datasets (e.g. 7.9% accuracy gain over Source only from 73.9% to 81.8% on HMDB --> UCF, and 10.3% gain on Kinetics --> Gameplay). The code and data are released at http://github.com/cmhungsteve/TA3N.
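To make "attending to the temporal dynamics using domain discrepancy" concrete, here is a minimal sketch of one way such attention could work. A domain classifier scores each local temporal feature, and features whose domain prediction is confident (low entropy, so a large remaining domain gap) receive more weight when pooled. The entropy-based weighting, the residual `1 + w` term, and the function names are assumptions for illustration; the abstract itself does not give the formula.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli domain prediction with probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def attend_temporal(local_features, domain_probs):
    """Pool local temporal features with domain-entropy attention.

    local_features: list of equal-length feature vectors, one per temporal segment.
    domain_probs:   probability of "source domain" predicted for each segment.
    Returns the pooled feature vector and the per-segment attention weights.
    """
    # Low entropy means the domains are easy to tell apart on this segment,
    # so the segment gets more attention during alignment (assumed scheme).
    weights = [1.0 - binary_entropy(p) for p in domain_probs]
    dim = len(local_features[0])
    pooled = [0.0] * dim
    for feat, w in zip(local_features, weights):
        for j in range(dim):
            # Residual form: every segment still contributes at least once.
            pooled[j] += (1.0 + w) * feat[j]
    return pooled, weights
```

The residual term keeps segments with ambiguous domain predictions from being discarded entirely, so the attention reweights the pooled feature rather than selecting a subset of segments.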

Computer Vision and Pattern Recognition; Machine Learning; Multimedia

Partial Domain Adaptation without Domain Alignment

Weikai Li, Songcan Chen, 2021

Unsupervised domain adaptation (UDA) aims to transfer knowledge from a well-labeled source domain to a different but related unlabeled target domain with identical label space. Currently, the main workhorse for solving UDA is domain alignment, which has proven successful. However, it is often difficult to find an appropriate source domain with identical label space. A more practical scenario is so-called partial domain adaptation (PDA), in which the source label set or space subsumes the target one. Unfortunately, in PDA, due to the existence of irrelevant categories in the source domain, it is quite hard to obtain a perfect alignment, which results in mode collapse and negative transfer. Although several efforts have been made to down-weight the irrelevant source categories, the strategies used tend to be burdensome and risky, since exactly which categories are irrelevant is unknown. These challenges motivate us to find a relatively simpler alternative to solve PDA. To achieve this, we first provide a thorough theoretical analysis, which illustrates that the target risk is bounded by both model smoothness and between-domain discrepancy. Considering the difficulty of perfect alignment in solving PDA, we focus on model smoothness while discarding the riskier domain alignment to enhance the adaptability of the model. Specifically, we instantiate the model smoothness as a quite simple intra-domain structure preserving (IDSP). To the best of our knowledge, this is the first naive attempt to address PDA without domain alignment. Finally, our empirical results on multiple benchmark datasets demonstrate that IDSP is not only superior to the PDA SOTAs by a significant margin on some benchmarks (e.g., +10% on Cl->Rw and +8% on Ar->Rw), but also complementary to domain alignment in the standard UDA.

Computer Vision and Pattern Recognition; Machine Learning

Adversarial Consistent Learning on Partial Domain Adaptation of PlantCLEF 2020 Challenge

Youshan Zhang, Brian D. Davison, 2020

Domain adaptation is one of the most crucial techniques to mitigate the domain shift problem, which exists when transferring knowledge from an abundantly labeled source domain to a target domain with few or no labels. Partial domain adaptation addresses the scenario where target categories are only a subset of source categories. In this paper, to enable the efficient representation of cross-domain plant images, we first extract deep features from pre-trained models and then develop adversarial consistent learning ($ACL$) in a unified deep architecture for partial domain adaptation. It consists of source domain classification loss, adversarial learning loss, and feature consistency loss. Adversarial learning loss can maintain domain-invariant features between the source and target domains. Moreover, feature consistency loss can preserve the fine-grained feature transition between two domains. We also find the shared categories of two domains via down-weighting the irrelevant categories in the source domain. Experimental results demonstrate that training features from the NASNetLarge model with the proposed $ACL$ architecture yields promising results on the PlantCLEF 2020 Challenge.

Computer Vision and Pattern Recognition

Channel-Temporal Attention for First-Person Video Domain Adaptation

Xianyuan Liu, Shuo Zhou, Tao Lei, 2021

Unsupervised Domain Adaptation (UDA) can transfer knowledge from labeled source data to unlabeled target data of the same categories. However, UDA for first-person action recognition is an under-explored problem, with a lack of datasets and limited consideration of first-person video characteristics. This paper focuses on addressing this problem. Firstly, we propose two small-scale first-person video domain adaptation datasets: ADL$_{small}$ and GTEA-KITCHEN. Secondly, we introduce channel-temporal attention blocks to capture the channel-wise and temporal-wise relationships and model their inter-dependencies, which are important to first-person vision. Finally, we propose a Channel-Temporal Attention Network (CTAN) to integrate these blocks into existing architectures. CTAN outperforms baselines on the two proposed datasets and one existing dataset, EPIC$_{cvpr20}$.

Computer Vision and Pattern Recognition; Artificial Intelligence