Research papers and master's and doctoral theses published by Fei Yang

A Simple and Effective Method To Eliminate the Self Language Bias in Multilingual Representations

98 - Ziyi Yang , Yinfei Yang , Daniel Cer 2021

Language-agnostic representation and the isolation of semantic and language information are an emerging research direction for multilingual representation models. We explore this problem from a novel angle of geometric algebra and semantic space. A simple but highly effective method, Language Information Removal (LIR), factors out language identity information from semantically related components in multilingual representations pre-trained on multi-monolingual data. A post-training and model-agnostic method, LIR uses only simple linear operations, e.g. matrix factorization and orthogonal projection. LIR reveals that for weak-alignment multilingual systems, the principal components of semantic spaces primarily encode language identity information. We first evaluate LIR on a cross-lingual question answer retrieval task (LAReQA), which requires strong alignment of the multilingual embedding space. Experiments show that LIR is highly effective on this task, yielding almost 100% relative improvement in MAP for weak-alignment models. We then evaluate LIR on the Amazon Reviews and XEVAL datasets, observing that removing language information improves cross-lingual transfer performance.
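The linear operations named in the abstract (matrix factorization followed by orthogonal projection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of `rank=1`, the use of an uncentered SVD, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def language_components(lang_embeddings, rank=1):
    """Return the top-`rank` right singular vectors of one language's embeddings.

    Rows are sentence embeddings. The matrix is deliberately not mean-centered
    (an assumption of this sketch), so a shared per-language offset shows up
    as the dominant direction.
    """
    _, _, vt = np.linalg.svd(lang_embeddings, full_matrices=False)
    return vt[:rank].T  # shape (dim, rank), orthonormal columns

def remove_language_information(embeddings, components):
    """Subtract from each row its orthogonal projection onto the components."""
    return embeddings - embeddings @ components @ components.T

# Synthetic example: random "semantic" vectors plus a large shared offset
# standing in for language identity information.
rng = np.random.default_rng(0)
semantic = rng.normal(size=(200, 16))
language_offset = np.zeros(16)
language_offset[0] = 10.0
embeddings = semantic + language_offset

components = language_components(embeddings, rank=1)
cleaned = remove_language_information(embeddings, components)

# Largest remaining coordinate along the removed direction (near zero).
residual = np.abs(cleaned @ components).max()
mean_norm_before = np.linalg.norm(embeddings.mean(axis=0))
mean_norm_after = np.linalg.norm(cleaned.mean(axis=0))
```

In this toy setting the shared offset dominates the first singular direction, so projecting it out shrinks the norm of the mean embedding while leaving the remaining dimensions untouched.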

Computation and Language, Artificial Intelligence

Programmed Wrapping and Assembly of Droplets with Mesoscale Polymers

138 - Dylan M. Barber , Zhefei Yang , Lucas Prevost 2021

Nature is remarkably adept at using interfaces to build structures, encapsulate reagents, and regulate biological processes. Inspired by Nature, we describe flexible polymer-based ribbons, termed mesoscale polymers (MSPs), to modulate interfacial interactions with liquid droplets. This produces unprecedented hybrid assemblies in the forms of flagellum-like structures and MSP-wrapped droplets. Successful preparation of these hybrid structures hinges on interfacial interactions and tailored MSP compositions, such as MSPs with domains possessing distinctly different affinity for fluid-fluid interfaces as well as mechanical properties. In situ measurements of MSP-droplet interactions confirm that MSPs possess a negligible bending stiffness, allowing interfacial energy to drive mesoscale assembly. By exploiting these interfacial driving forces, mesoscale polymers are demonstrated as a powerful platform that underpins the preparation of sophisticated hybrid structures in fluids.

Soft Condensed Matter, Materials Science

3D Shapes Local Geometry Codes Learning with SDF

94 - Shun Yao , Fei Yang , Yongmei Cheng 2021

A signed distance function (SDF) as the 3D shape description is one of the most effective approaches to represent 3D geometry for rendering and reconstruction. Our work is inspired by the state-of-the-art method DeepSDF, which learns and analyzes the 3D shape as the iso-surface of its shell and has shown promising results, especially in 3D shape reconstruction and compression. In this paper, we consider the degeneration problem of reconstruction that comes from the capacity decrease of the DeepSDF model, which approximates the SDF with a neural network and a single latent code. We propose Local Geometry Code Learning (LGCL), a model that improves the original DeepSDF results by learning from the local shape geometry of the full 3D shape. We add an extra graph neural network to split the single transmittable latent code into a set of local latent codes distributed on the 3D shape. These latent codes are used to approximate the SDF in their local regions, which reduces the complexity of the approximation compared to the original DeepSDF. Furthermore, we introduce a new geometric loss function to facilitate the training of these local latent codes. Note that other local shape adjusting methods use the 3D voxel representation, which in turn poses a problem that is highly difficult or even impossible to solve. In contrast, our architecture is based implicitly on graph processing and performs the learning regression process directly in the latent code space, making the proposed architecture more flexible and simpler to realize. Our experiments on 3D shape reconstruction demonstrate that our LGCL method can keep more details with a significantly smaller SDF decoder and considerably outperforms the original DeepSDF method under the most important quantitative metrics.

Computer Vision and Pattern Recognition

The Endokernel: Fast, Secure, and Programmable Subprocess Virtualization

109 - Bumjin Im 2021

Commodity applications contain more and more combinations of interacting components (user, application, library, and system) and exhibit increasingly diverse tradeoffs between isolation, performance, and programmability. We argue that the challenge of future runtime isolation is best met by embracing the multi-principle nature of applications, rethinking process architecture for fast and extensible intra-process isolation. We present the Endokernel, a new process model and security architecture that nests an extensible monitor into the standard process for building efficient least-authority abstractions. The Endokernel introduces a new virtual machine abstraction for representing subprocess authority, which is enforced by an efficient self-isolating monitor that maps the abstraction to system level objects (processes, threads, files, and signals). We show how the Endokernel can be used to develop specialized separation abstractions using an exokernel-like organization to provide virtual privilege rings, which we use to reorganize and secure NGINX. Our prototype includes a new syscall monitor, the nexpoline, and explores the tradeoffs of implementing it with diverse mechanisms, including Intel Control Enhancement Technology. Overall, we believe sub-process isolation is a must and that the Endokernel exposes an essential set of abstractions for realizing this in a simple and feasible way.

Cryptography and Security

Partial Video Domain Adaptation with Partial Adversarial Temporal Attentive Network

91 - Yuecong Xu , Jianfei Yang , Haozhi Cao 2021

Partial Domain Adaptation (PDA) is a practical and general domain adaptation scenario, which relaxes the fully shared label space assumption such that the source label space subsumes the target one. The key challenge of PDA is the issue of negative transfer caused by source-only classes. For videos, such negative transfer could be triggered by both spatial and temporal features, which leads to a more challenging Partial Video Domain Adaptation (PVDA) problem. In this paper, we propose a novel Partial Adversarial Temporal Attentive Network (PATAN) to address the PVDA problem by utilizing both spatial and temporal features for filtering source-only classes. Besides, PATAN constructs effective overall temporal features by attending to local temporal features that contribute more toward the class filtration process. We further introduce new benchmarks to facilitate research on PVDA problems, covering a wide range of PVDA scenarios. Empirical results demonstrate the state-of-the-art performance of our proposed PATAN across the multiple PVDA benchmarks.

Computer Vision and Pattern Recognition, Artificial Intelligence

Aligning Correlation Information for Domain Adaptation in Action Recognition

126 - Yuecong Xu , Jianfei Yang , Haozhi Cao 2021

Domain adaptation (DA) approaches address domain shift and enable networks to be applied to different scenarios. Although various image DA approaches have been proposed in recent years, there is limited research on video DA. This is partly due to the complexity of adapting the different modalities of features in videos, which include the correlation features extracted as long-term dependencies of pixels across spatiotemporal dimensions. The correlation features are highly associated with action classes and have proven effective for accurate video feature extraction in the supervised action recognition task. Yet correlation features of the same action differ across domains due to domain shift. Therefore we propose a novel Adversarial Correlation Adaptation Network (ACAN) to align action videos by aligning pixel correlations. ACAN aims to minimize the distribution discrepancy of correlation information, termed Pixel Correlation Discrepancy (PCD). Additionally, video DA research is limited by the lack of cross-domain video datasets with larger domain shifts. We therefore introduce a novel HMDB-ARID dataset with a larger domain shift caused by a larger statistical difference between domains. This dataset is built in an effort to leverage current datasets for dark video classification. Empirical results demonstrate the state-of-the-art performance of our proposed ACAN on both existing and the new video DA datasets.

Computer Vision and Pattern Recognition, Artificial Intelligence

Exponential Approximation of Band-limited Signals from Nonuniform Sampling

113 - Yunfei Yang , Haizhang Zhang 2021

Reconstructing a band-limited function from its finite sample data is a fundamental task in signal analysis. A simple Gaussian or hyper-Gaussian regularized Shannon sampling series has been proved to achieve exponential convergence for uniform sampling. In this paper, we prove that exponential approximation can also be attained for general nonuniform sampling. The analysis is based on the residue theorem to represent the truncated error by a contour integral. Several concrete examples of nonuniform sampling with exponential convergence will be presented.
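For orientation, the uniform-sampling case mentioned in the abstract can be sketched as a truncated Shannon series whose sinc kernel is damped by a Gaussian. This is only an illustration of that known uniform-sampling setting, not the nonuniform scheme the paper analyzes; the function names, the test signal, and the rule used to pick `sigma` are assumptions.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def gaussian_shannon(sample_at, t, half_width, sigma):
    """Approximate f(t) from integer samples f(n) near t.

    Uses the 2 * half_width + 1 samples closest to t and multiplies each
    sinc term by a Gaussian window of standard deviation sigma.
    """
    center = round(t)
    total = 0.0
    for n in range(center - half_width, center + half_width + 1):
        x = t - n
        window = math.exp(-x * x / (2.0 * sigma ** 2))
        total += sample_at(n) * sinc(x) * window
    return total

# Test signal with angular bandwidth 0.5, well below the Nyquist limit pi
# for unit sample spacing (oversampling is what makes the window effective).
bandwidth = 0.5

def signal(t):
    return math.cos(bandwidth * t)

half_width = 20
# Assumed heuristic: balance the window's time-domain truncation against
# its frequency-domain leakage.
sigma = math.sqrt(half_width / (math.pi - bandwidth))

approx = gaussian_shannon(signal, 0.3, half_width, sigma)
error = abs(approx - signal(0.3))
```

With only 41 samples the windowed series reproduces the oversampled signal far more accurately than the slow decay of an unwindowed sinc sum would suggest, which is the behavior the exponential convergence results describe.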

Signal Processing, Information Theory

Local connectivity of Julia sets for rational maps with Siegel disks

85 - Shuyi Wang , Fei Yang , Gaofei Zhang 2021

We prove that a long iteration of rational maps is expansive near boundaries of bounded type Siegel disks. This leads us to extend Petersen's local connectivity result on the Julia sets of quadratic Siegel polynomials to a general case.

Dynamical Systems, Complex Variables

Ensemble Defense with Data Diversity: Weak Correlation Implies Strong Robustness

92 - Renjue Li , Hanwei Zhang , Pengfei Yang 2021

In this paper, we propose a framework of filter-based ensemble of deep neural networks (DNNs) to defend against adversarial attacks. The framework builds an ensemble of sub-models -- DNNs with differentiated preprocessing filters. From the theoretical perspective of DNN robustness, we argue that under the assumption of high quality of the filters, the weaker the correlations of the sensitivity of the filters are, the more robust the ensemble model tends to be, and this is corroborated by the experiments of transfer-based attacks. Correspondingly, we propose a principle that chooses the specific filters with smaller Pearson correlation coefficients, which ensures the diversity of the inputs received by DNNs, as well as the effectiveness of the entire framework against attacks. Our ensemble models are more robust than those constructed by previous defense methods like adversarial training, and even competitive with the classical ensemble of adversarially trained DNNs under adversarial attacks when the attacking radius is large.
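The selection principle (prefer filters whose sensitivities are weakly correlated) can be sketched as follows. The abstract does not specify how sensitivity is measured or how filters are searched, so the filter names, the sensitivity numbers, the use of the absolute value of the coefficient, and the pairwise search are all hypothetical.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def least_correlated_pair(sensitivities):
    """Return the pair of filter names with the smallest |Pearson r|."""
    return min(
        combinations(sensitivities, 2),
        key=lambda pair: abs(pearson(sensitivities[pair[0]], sensitivities[pair[1]])),
    )

# Hypothetical per-sample sensitivity scores for three preprocessing filters.
sensitivities = {
    "blur": [1.0, 2.0, 3.0, 4.0, 5.0],
    "sharpen": [1.0, 2.0, 3.0, 4.0, 6.0],
    "noise": [5.0, 1.0, 4.0, 2.0, 3.0],
}

pair = least_correlated_pair(sensitivities)
r_blur_noise = pearson(sensitivities["blur"], sensitivities["noise"])
r_blur_sharpen = pearson(sensitivities["blur"], sensitivities["sharpen"])
```

Here `blur` and `sharpen` respond almost identically, so pairing either of them with `noise` yields a more diverse ensemble than pairing them with each other.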

Machine Learning

Pathdreamer: A World Model for Indoor Navigation

382 - Jing Yu Koh , Honglak Lee , Yinfei Yang 2021

People navigating in unfamiliar buildings take advantage of myriad visual, spatial and semantic cues to efficiently achieve their navigation goals. Towards equipping computational agents with similar capabilities, we introduce Pathdreamer, a visual world model for agents navigating in novel indoor environments. Given one or more previous visual observations, Pathdreamer generates plausible high-resolution 360-degree visual observations (RGB, semantic segmentation and depth) for viewpoints that have not been visited, in buildings not seen during training. In regions of high uncertainty (e.g. predicting around corners, imagining the contents of an unseen room), Pathdreamer can predict diverse scenes, allowing an agent to sample multiple realistic outcomes for a given trajectory. We demonstrate that Pathdreamer encodes useful and accessible visual, spatial and semantic knowledge about human environments by using it in the downstream task of Vision-and-Language Navigation (VLN). Specifically, we show that planning ahead with Pathdreamer brings about half the benefit of looking ahead at actual observations from unobserved parts of the environment. We hope that Pathdreamer will help unlock model-based approaches to challenging embodied navigation tasks such as navigating to specified objects and VLN.

Computer Vision and Pattern Recognition, Machine Learning
