ترغب بنشر مسار تعليمي؟ اضغط هنا

Contrastive learning models have achieved great success in unsupervised visual representation learning, which maximize the similarities between feature representations of different views of the same image, while minimize the similarities between feat ure representations of views of different images. In text summarization, the output summary is a shorter form of the input document and they have similar meanings. In this paper, we propose a contrastive learning model for supervised abstractive text summarization, where we view a document, its gold summary and its model generated summaries as different views of the same mean representation and maximize the similarities between them during training. We improve over a strong sequence-to-sequence text generation model (i.e., BART) on three different summarization datasets. Human evaluation also shows that our model achieves better faithfulness ratings compared to its counterpart without contrastive objectives.
Recent progress of abstractive text summarization largely relies on large pre-trained sequence-to-sequence Transformer models, which are computationally expensive. This paper aims to distill these large models into smaller ones for faster inference a nd minimal performance loss. Pseudo-labeling based methods are popular in sequence-to-sequence model distillation. In this paper, we find simply manipulating attention temperatures in Transformers can make pseudo labels easier to learn for student models. Our experiments on three summarization datasets show our proposed method consistently improves over vanilla pseudo-labeling based methods. We also find that both the pseudo labels and summaries produced by our students are shorter and more abstractive. We will make our code and models publicly available.
Distributed renewable energy systems are now widely installed in many buildings, transforming the buildings into electricity prosumers. Existing studies have developed some advanced building side controls that enable renewable energy sharing and that aim to optimise building-cluster-level performance via regulating the energy storage charging/ discharging. However, the flexible demand shifting ability of electric vehicles is not considered in these building side controls. For instance, the electric vehicle charging will usually start once they are plugged into charging stations. But, in such charging period the renewable generation may be insufficient to cover the EV charging load, leading to grid electricity imports. Consequently, the building-cluster-level performance is not optimised. Therefore, this study proposes a coordinated control of building prosumers for improving the cluster-level performance, by making use of energy sharing and storage capability of electricity batteries in both buildings and EVs. An EV charging/discharging model is first developed. Then, based on the predicted future 24h electricity demand and renewable generation data, the coordinated control first considers the whole building cluster as one integrated building and optimises its operation as well as the EV charging/discharging using genetic algorithm. Next, the operation of individual buildings in the future 24h is coordinated using nonlinear programming. For validation, the developed control has been tested on a real building cluster in Ludvika, Sweden. The study results show that the developed control can increase the cluster-level daily renewable self-consumption rate by 19% and meanwhile reduce the daily electricity bills by 36% compared with the conventional controls.
As an important perceptual characteristic of the Human Visual System (HVS), the Just Noticeable Difference (JND) has been studied for decades with image/video processing (e.g., perceptual image/video coding). However, there is little exploration on t he existence of JND for AI, like Deep Machine Vision (DMV), although the DMV has made great strides in many machine vision tasks. In this paper, we take an initial attempt, and demonstrate that DMV does have the JND, termed as DMVJND. Besides, we propose a JND model for the classification task in DMV. It has been discovered that DMV can tolerate distorted images with average PSNR of only 9.56dB (the lower the better), by generating JND via unsupervised learning with our DMVJND-NET. In particular, a semantic-guided redundancy assessment strategy is designed to constrain the magnitude and spatial distribution of the JND. Experimental results on classification tasks demonstrate that we successfully find and model the JND for deep machine vision. Meanwhile, our DMV-JND paves a possible direction for DMV oriented image/video compression, watermarking, quality assessment, deep neural network security, and so on.
We propose a novel language-independent approach to improve the efficiency for Grammatical Error Correction (GEC) by dividing the task into two subtasks: Erroneous Span Detection (ESD) and Erroneous Span Correction (ESC). ESD identifies grammatically incorrect text spans with an efficient sequence tagging model. Then, ESC leverages a seq2seq model to take the sentence with annotated erroneous spans as input and only outputs the corrected text for these spans. Experiments show our approach performs comparably to conventional seq2seq approaches in both English and Chinese GEC benchmarks with less than 50% time cost for inference.
Abstractive document summarization is usually modeled as a sequence-to-sequence (Seq2Seq) learning problem. Unfortunately, training large Seq2Seq based summarization models on limited supervised summarization data is challenging. This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq based abstractive summarization model on unlabeled text. The main idea is that, given an input text artificially constructed from a document, a model is pre-trained to reinstate the original document. These objectives include sentence reordering, next sentence generation, and masked document generation, which have close relations with the abstractive document summarization task. Experiments on two benchmark summarization datasets (i.e., CNN/DailyMail and New York Times) show that all three objectives can improve performance upon baselines. Compared to models pre-trained on large-scale data (more than 160GB), our method, with only 19GB text for pre-training, achieves comparable results, which demonstrates its effectiveness.
The robustness of deep learning models against adversarial attacks has received increasing attention in recent years. However, both deep learning and adversarial training rely on the availability of a large amount of labeled data and usually do not g eneralize well to new, unseen classes when only a few training samples are accessible. To address this problem, we explicitly introduce a new challenging problem -- how to learn a robust deep model with limited training samples per class, called defensive few-shot learning in this paper. Simply employing the existing adversarial training techniques in the literature cannot solve this problem. This is because few-shot learning needs to learn transferable knowledge from disjoint auxiliary data, and thus it is invalid to assume the sample-level distribution consistency between the training and test sets as commonly assumed in existing adversarial training techniques. In this paper, instead of assuming such a distribution consistency, we propose to make this assumption at a task-level in the episodic training paradigm in order to better transfer the defense knowledge. Furthermore, inside each task, we design a task-conditioned distribution constraint to narrow the distribution gap between clean and adversarial examples at a sample-level. These give rise to a novel mechanism called multi-level distribution based adversarial training (MDAT) for learning transferable adversarial defense. In addition, a unified $mathcal{F}_{beta}$ score is introduced to evaluate different defense methods under the same principle. Extensive experiments demonstrate that MDAT achieves higher effectiveness and robustness over existing alternatives in the few-shot case.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا