
This paper considers the unsupervised domain adaptation problem for neural machine translation (NMT), where we assume access to only monolingual text in either the source or the target language of the new domain. We propose a cross-lingual data selection method to extract in-domain sentences in the missing language side from a large generic monolingual corpus. Our method trains an adaptive layer on top of multilingual BERT with contrastive learning to align the representations of the source and target languages, which enables the domain classifier to transfer between the languages in a zero-shot manner. Once in-domain data is detected by the classifier, the NMT model is adapted to the new domain by jointly learning translation and domain discrimination tasks. We evaluate our cross-lingual data selection method on NMT across five diverse domains in three language pairs, as well as on a real-world COVID-19 translation scenario. The results show that our proposed method outperforms other selection baselines by up to +1.5 BLEU.
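A minimal sketch of the core idea described above (not the authors' released code; all module and variable names are illustrative assumptions): an adaptive projection over frozen multilingual BERT sentence embeddings is trained with a contrastive loss so that a domain classifier fitted in one language can score sentences in the other.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveLayer(nn.Module):
    """Projects frozen mBERT sentence embeddings into a shared cross-lingual space."""
    def __init__(self, dim=768, proj_dim=256):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(dim, proj_dim), nn.ReLU(),
                                  nn.Linear(proj_dim, proj_dim))

    def forward(self, x):
        return F.normalize(self.proj(x), dim=-1)

def contrastive_loss(src_emb, tgt_emb, temperature=0.05):
    """InfoNCE-style loss: aligned source/target sentences in the batch are
    positives; every other pairing serves as a negative."""
    logits = src_emb @ tgt_emb.t() / temperature
    labels = torch.arange(src_emb.size(0), device=src_emb.device)
    return F.cross_entropy(logits, labels)

# Zero-shot transfer, schematically:
# 1) train the adapter with contrastive_loss on embeddings of parallel sentences,
# 2) train a domain classifier (in-domain vs. generic) in the language that has
#    in-domain monolingual data,
# 3) apply the same classifier to adapter(embeddings) of the other language to
#    select in-domain sentences from the large generic monolingual corpus.
domain_classifier = nn.Linear(256, 2)
```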
Advertising is critical to many online e-commerce platforms such as eBay and Amazon. One of the important signals these platforms rely on is click-through rate (CTR) prediction. The recent popularity of multi-modal sharing platforms such as TikTok has led to increased interest in online micro-videos. It is therefore useful to consider micro-videos to help merchants target micro-video advertising better and to identify users' favourite content to enhance the user experience. Existing work on CTR prediction largely exploits unimodal content to learn item representations; relatively little effort has been made to leverage multi-modal information exchange between users and items. We propose a model that exploits temporal user-item interactions to guide representation learning with multi-modal features and then predicts a user's click rate on a micro-video item. We design a Hypergraph Click-Through Rate prediction framework (HyperCTR) built upon the hyperedge notion of hypergraph neural networks, which yields modal-specific representations of users and micro-videos to better capture user preferences. We construct a time-aware user-item bipartite network with multi-modal information and enrich the representation of each user and item with the generated interest-based user hypergraph and item hypergraph. Through extensive experiments on three public datasets, we demonstrate that our proposed model significantly outperforms various state-of-the-art methods.
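To make the hyperedge notion concrete, here is a minimal, generic hypergraph message-passing layer (a sketch, not the HyperCTR implementation): node features are aggregated into each hyperedge and scattered back, using a dense node-by-hyperedge incidence matrix for clarity.

```python
import torch
import torch.nn as nn

class HypergraphConv(nn.Module):
    """One layer of hypergraph message passing: nodes -> hyperedges -> nodes.
    H is the node-by-hyperedge incidence matrix (dense here for clarity)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, H):
        d_v = H.sum(dim=1).clamp(min=1)                  # node degrees
        d_e = H.sum(dim=0).clamp(min=1)                  # hyperedge degrees
        edge_feat = (H.t() @ x) / d_e.unsqueeze(1)       # pool nodes into hyperedges
        node_feat = (H @ edge_feat) / d_v.unsqueeze(1)   # scatter back to nodes
        return torch.relu(self.theta(node_feat))

# Illustration: 6 users/items, 3 interest-based hyperedges, 16-d multimodal features.
H = torch.bernoulli(torch.full((6, 3), 0.5))
x = torch.randn(6, 16)
out = HypergraphConv(16, 8)(x, H)   # modal-specific representations, shape (6, 8)
```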
A hashtag, a product of user tagging behaviour, can concisely describe the semantics of user-generated content on social network applications, e.g., the recently popular micro-videos. Hashtags have been widely used to facilitate various micro-video retrieval scenarios, such as search and categorization. To leverage hashtags on micro-media platforms for effective e-commerce marketing campaigns, there is demand from the e-commerce industry for a mapping algorithm that bridges its categories and micro-video hashtags. In this demo paper, we therefore propose a novel solution called TagPick that incorporates clues from user behaviour metadata (hashtags, interactions, multimedia information) as well as relational data (graph-based networks) into a unified system to reveal the correlation between e-commerce categories and hashtags in industrial scenarios. In particular, we provide a tag-level popularity strategy to recommend relevant hashtags for e-commerce platforms (e.g., eBay).
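One plausible (purely illustrative, not the TagPick system) reading of a tag-level popularity strategy is to rank hashtags for a category by co-occurrence weighted by interaction counts; the data schema below is a hypothetical assumption.

```python
from collections import Counter

def tag_popularity(videos, category):
    """Rank hashtags for an e-commerce category by how often they co-occur with
    it, weighted by interaction counts. `videos` is a list of dicts with
    hypothetical keys: 'hashtags', 'categories', 'interactions'."""
    scores = Counter()
    for v in videos:
        if category in v["categories"]:
            for tag in v["hashtags"]:
                scores[tag] += 1 + v.get("interactions", 0)
    return scores.most_common()

videos = [
    {"hashtags": ["#sneakers", "#ootd"], "categories": ["Shoes"], "interactions": 120},
    {"hashtags": ["#sneakers"], "categories": ["Shoes"], "interactions": 40},
]
print(tag_popularity(videos, "Shoes"))  # [('#sneakers', 162), ('#ootd', 121)]
```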
Machine-learning-as-a-service (MLaaS) has attracted millions of users with its high-performing, sophisticated models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works has demonstrated that attackers can steal or extract the victim models. Nonetheless, none of the previously stolen models could outperform the original black-box APIs. In this work, we take the first step towards showing that attackers can potentially surpass victims via unsupervised domain adaptation and multi-victim ensembling. Extensive experiments on benchmark datasets and real-world APIs validate that imitators can succeed in outperforming the original black-box models. We consider this a milestone in the research on imitation attacks, especially on NLP APIs, as the superior performance could influence the defence or even the publishing strategy of API providers.
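A minimal sketch of the multi-victim ensembling ingredient named above (not the paper's code; victim outputs are fabricated placeholders): several victims' predicted distributions are averaged into soft pseudo labels for the imitator.

```python
import numpy as np

def ensemble_pseudo_labels(victim_probs):
    """Average the class distributions returned by several victim APIs
    (multi-victim ensemble) into soft pseudo labels for the imitator."""
    return np.stack(victim_probs, axis=0).mean(axis=0)

# Hypothetical outputs of three victim APIs for two queries over three classes.
v1 = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
v2 = np.array([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]])
v3 = np.array([[0.8, 0.1, 0.1], [0.1, 0.6, 0.3]])
soft_labels = ensemble_pseudo_labels([v1, v2, v3])
# The imitation model is then trained on queries drawn from the attacker's own
# target domain with these soft labels, which is where unsupervised domain
# adaptation enters the picture.
```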
Yiyuan Zhang, Guangfu Cao, Li He (2021)
In this paper, we investigate the boundedness of the Toeplitz product $T_{f}T_{g}$ and the Hankel product $H_{f}^{*}H_{g}$ on the Fock-Sobolev space for two polynomials $f$ and $g$ in $z,\overline{z}\in\mathbb{C}^{n}$. As a result, the boundedness of the Toeplitz operator $T_{f}$ and the Hankel operator $H_{f}$ with polynomial symbol $f$ in $z,\overline{z}\in\mathbb{C}^{n}$ is characterized.
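For context, the operators in question are defined in the standard way (conventions vary slightly across the literature); writing $P$ for the orthogonal projection from the ambient $L^2$ space onto the Fock-Sobolev space, one also has the familiar algebraic identity relating the two products:

```latex
% Standard definitions and the identity linking the two products:
\begin{align*}
  T_f u &= P(fu), & H_f u &= (I-P)(fu),\\
  T_f T_g u &= P\bigl(f\,P(gu)\bigr), & H_f^{*}H_g &= T_{\overline{f}g}-T_{\overline{f}}T_g .
\end{align*}
```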
Lili He, Hans Lindblad (2021)
In this work we give a complete picture of how to define the mass at null infinity in harmonic coordinates in a direct and simple way, in three different ways that we show satisfy the Bondi mass-loss law. The first and second ways involve only the limit of the metric (the Trautman mass) and, respectively, the null second fundamental forms along asymptotically characteristic surfaces (the asymptotic Hawking mass), which depend only on the ADM mass. The last, in an original way, involves the construction of special characteristic coordinates at null infinity (the Bondi mass). The results here rely on the asymptotics of the metric derived in [24].
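For reference, the Bondi mass-loss law that each of the three definitions is shown to satisfy can be written schematically as below; the normalization is convention dependent, and $N_{AB}$ denotes the news tensor, the retarded-time derivative of the asymptotic shear $C_{AB}$.

```latex
% Schematic Bondi mass-loss law (normalization convention dependent):
\[
  \frac{dM_B(u)}{du}
  = -\frac{1}{32\pi}\oint_{S^2} N_{AB}\,N^{AB}\,d\Omega \;\le\; 0,
  \qquad N_{AB} := \partial_u C_{AB}.
\]
```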
Semi-Supervised Learning (SSL) has seen success in many application domains, but this success often hinges on the availability of task-specific unlabeled data. Knowledge distillation (KD) has enabled compressing deep networks and ensembles, achieving the best results when distilling knowledge on fresh task-specific unlabeled examples. However, task-specific unlabeled data can be challenging to find. We present a general framework called generate, annotate, and learn (GAL) that uses unconditional generative models to synthesize in-domain unlabeled data, helping advance SSL and KD on different tasks. To obtain strong task-specific generative models, we adopt generic generative models, pretrained on open-domain data, and fine-tune them on inputs from specific tasks. Then, we use existing classifiers to annotate generated unlabeled examples with soft pseudo labels, which are used for additional training. When self-training is combined with samples generated from GPT2-large, fine-tuned on the inputs of each GLUE task, we outperform a strong RoBERTa-large baseline on the GLUE benchmark. Moreover, KD on GPT-2 samples yields a new state-of-the-art for 6-layer transformers on the GLUE leaderboard. Finally, self-training with GAL offers significant gains on image classification on CIFAR-10 and on four tabular tasks from the UCI repository.
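A minimal sketch of the "annotate and learn" half of this loop (the generation half, fine-tuning an open-domain generator such as GPT-2 on task inputs, is not shown); the function names and training setup are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def gal_step(student, teacher, optimizer, synthetic_batch):
    """One GAL training step: an existing classifier (or ensemble) acts as the
    teacher and produces soft pseudo labels for generated inputs; the student is
    trained to match them via KL divergence (soft cross-entropy)."""
    with torch.no_grad():
        soft_labels = F.softmax(teacher(synthetic_batch), dim=-1)
    log_probs = F.log_softmax(student(synthetic_batch), dim=-1)
    loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# `synthetic_batch` stands in for already-encoded samples drawn from the
# task-fine-tuned generative model; mixing these steps with ordinary supervised
# steps on the labeled data gives the self-training / KD variants of GAL.
```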
The advances in pre-trained models (e.g., BERT, XLNet, etc.) have largely revolutionized the predictive performance of various modern natural language processing tasks. This allows corporations to provide machine learning as a service (MLaaS) by encapsulating fine-tuned BERT-based models as commercial APIs. However, previous works have discovered a series of vulnerabilities in BERT-based APIs. For example, BERT-based APIs are vulnerable to both model extraction attacks and adversarial example transferability attacks. Moreover, due to the high capacity of BERT-based APIs, the fine-tuned model can easily be over-learned, yet what kind of information can be leaked from the extracted model remains unknown and under-explored. To bridge this gap, in this work we first present an effective model extraction attack, where the adversary can practically steal a BERT-based API (the target/victim model) by issuing only a limited number of queries. We further develop an effective attribute inference attack to expose sensitive attributes of the training data used by the BERT-based APIs. Our extensive experiments on benchmark datasets under various realistic settings demonstrate the potential vulnerabilities of BERT-based APIs.
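A hedged sketch of the extraction step only, under the assumption that the victim API returns label probabilities; the callable names and encoding are hypothetical, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def build_transfer_set(victim_api, queries):
    """Query the black-box victim (a hypothetical callable returning a list of
    label probabilities) and keep (query, output) pairs as the transfer set."""
    return [(q, victim_api(q)) for q in queries]

def train_extracted_model(student, encode, optimizer, transfer_set):
    """Fit a local copy (the extracted model) to imitate the victim's outputs."""
    for text, victim_probs in transfer_set:
        logits = student(encode(text))                     # shape (1, num_classes)
        target = torch.tensor(victim_probs).unsqueeze(0)   # soft labels from the API
        loss = F.kl_div(F.log_softmax(logits, dim=-1), target,
                        reduction="batchmean")
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# A subsequent attribute inference attack would probe the extracted model, e.g.
# by training a lightweight classifier on its representations to predict a
# sensitive attribute of the original training data.
```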
Controlling thermal transport at the nanoscale is vital for many applications. Previously, it has been shown that this control can be achieved with periodically nanostructured two-dimensional phononic crystals in the case of suspended devices. Here we show that thermal conductance can also be controlled with three-dimensional phononic crystals, allowing the engineering of the thermal contact of more varied devices without the need for suspension in the future. We show experimental results measured at sub-Kelvin temperatures for three-dimensional crystals with two different periods, as well as for a bulk control structure. The results show that the conductance can be enhanced with the phononic crystal structures in our geometry. This result cannot be fully explained by the simplest theory taking into account the coherent modification of the phonon band structure, calculated with finite-element-method simulations.
Natural language processing (NLP) tasks, ranging from text classification to text generation, have been revolutionised by pre-trained language models such as BERT. This allows corporations to easily build powerful APIs by encapsulating fine-tuned BERT models for downstream tasks. However, when a fine-tuned BERT model is deployed as a service, it may suffer from different attacks launched by malicious users. In this work, we first present how an adversary can steal a BERT-based API service (the victim/target model) on multiple benchmark datasets with limited prior knowledge and queries. We further show that the extracted model can lead to highly transferable adversarial attacks against the victim model. Our studies indicate that the potential vulnerabilities of BERT-based API services still hold even when there is an architectural mismatch between the victim model and the attack model. Finally, we investigate two defence strategies to protect the victim model and find that, unless the performance of the victim model is sacrificed, both model extraction and adversarial transferability can still effectively compromise the target models.
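Complementing the extraction sketch above, here is a minimal illustration of the transferability step: adversarial perturbations are crafted white-box against the locally extracted model and then submitted to the victim. Embedding-space FGSM is used purely as an illustrative attack; it is an assumption, not the method described in the paper.

```python
import torch

def transfer_attack(extracted_model, loss_fn, embeddings, labels, eps=0.01):
    """Craft adversarial perturbations against the *extracted* model (white-box,
    embedding-space FGSM) and return the perturbed inputs; because the extracted
    model imitates the victim, such attacks often transfer to the victim API."""
    x = embeddings.clone().detach().requires_grad_(True)
    loss = loss_fn(extracted_model(x), labels)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()
```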
