Research papers, master's and doctoral theses about "constrained"

RoR: Read-over-Read for Long Document Machine Reading Comprehension

185 - Association for Computational Linguistics 2021 Article

Transformer-based pre-trained models, such as BERT, have achieved remarkable results on machine reading comprehension. However, due to the constraint on encoding length (e.g., 512 WordPiece tokens), a long document is usually split into multiple chunks that are read independently. As a result, the reading field is limited to individual chunks, with no information shared across chunks for long document machine reading comprehension. To address this problem, we propose RoR, a read-over-read method, which expands the reading field from chunk to document. Specifically, RoR includes a chunk reader and a document reader. The former first predicts a set of regional answers for each chunk, which are then compacted into a highly condensed version of the original document that is guaranteed to be encoded once. The latter then predicts the global answers from this condensed document. Finally, a voting strategy is used to aggregate and rerank the regional and global answers for the final prediction. Extensive experiments on two benchmarks, QuAC and TriviaQA, demonstrate the effectiveness of RoR for long document reading. Notably, RoR ranked first on the QuAC leaderboard (https://quac.ai/) at the time of submission (May 17th, 2021).

constrained question answering, long document machine, document machine reading
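The chunking and answer aggregation described in the RoR abstract above can be illustrated with a minimal sketch. The function names, the stride of 256 tokens, and the rule of summing scores for identical answer strings are assumptions made for illustration; the abstract does not specify the actual voting strategy, and no reader model is included here.

```python
from collections import defaultdict


def split_into_chunks(tokens, max_len=512, stride=256):
    """Split a token list into overlapping windows of at most max_len tokens.

    The stride value is a hypothetical choice, not taken from the abstract.
    """
    if len(tokens) <= max_len:
        return [tokens]
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += stride
    return chunks


def vote(regional_answers, global_answers):
    """Aggregate (answer_text, score) pairs from both readers.

    Assumed rule: scores of identical answer strings are summed and the
    highest total wins. This is a stand-in, not the published strategy.
    """
    totals = defaultdict(float)
    for text, score in list(regional_answers) + list(global_answers):
        totals[text] += score
    return max(totals.items(), key=lambda item: item[1])
```

For example, a 1000-token document with these defaults yields three overlapping chunks, and an answer proposed by both readers can outrank a higher-scoring answer proposed by only one.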

Transformer-based Lexically Constrained Headline Generation

125 - Association for Computational Linguistics 2021 Article

This paper explores a variant of automatic headline generation in which a generated headline is required to include a given phrase, such as a company or product name. Previous methods using Transformer-based models generate a headline that includes a given phrase by providing the encoder with additional information corresponding to that phrase. However, these methods cannot always include the phrase in the generated headline. Inspired by previous RNN-based methods that generate token sequences in backward and forward directions from the given phrase, we propose a simple Transformer-based method that is guaranteed to include the given phrase in a high-quality generated headline. We also consider a new headline generation strategy that takes advantage of the controllable generation order of the Transformer. Our experiments with the Japanese News Corpus demonstrate that our methods, which are guaranteed to include the phrase in the generated headline, achieve ROUGE scores comparable to previous Transformer-based methods. We also show that our generation strategy performs better than previous strategies.

transformer-based lexically constrained, lexically constrained headline, lexically constrained

Sesame Street to Mount Sinai: BERT-constrained character-level Moses models for multilingual lexical normalization

229 - Association for Computational Linguistics 2021 Article

This paper describes the HEL-LJU submissions to the MultiLexNorm shared task on multilingual lexical normalization. Our system is based on a BERT token classification preprocessing step, in which the type of transformation needed for each token is predicted (none, uppercase, lowercase, capitalize, modify), and a character-level SMT step in which the text is translated from original to normalized form given the BERT-predicted transformation constraints. For some languages, depending on the results on development data, the training data was extended by back-translating OpenSubtitles data. In the final ranking of the ten participating teams, the HEL-LJU team took second place, scoring better than the previous state of the art.

bert-constrained character-level moses, multilingual lexical normalization, character-level moses models
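The five transformation types named in the HEL-LJU abstract above can be made concrete with a small sketch. It assumes, purely for illustration, that the casing labels are applied directly to each token and that tokens labelled "modify" are passed through unchanged for a later character-level step. In the described system the labels act as constraints on a character-level SMT model, which is not reproduced here, and the function name is hypothetical.

```python
def apply_transformation(token, label):
    """Apply a predicted transformation label to a single token.

    Assumption: casing labels are applied directly; "modify" tokens are
    returned unchanged so that a separate character-level model can
    rewrite them later.
    """
    if label == "none":
        return token
    if label == "uppercase":
        return token.upper()
    if label == "lowercase":
        return token.lower()
    if label == "capitalize":
        return token.capitalize()
    if label == "modify":
        return token
    raise ValueError(f"unknown transformation label: {label}")


# Made-up tokens and labels, for illustration only.
tokens = ["new", "york", "IS", "gr8"]
labels = ["capitalize", "capitalize", "lowercase", "modify"]
normalized = [apply_transformation(t, l) for t, l in zip(tokens, labels)]
```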

Re-embedding Difficult Samples via Mutual Information Constrained Semantically Oversampling for Imbalanced Text Classification

457 - Association for Computational Linguistics 2021 Article

Difficult samples of the minority class in imbalanced text classification are usually hard to classify because they are embedded in a semantic region that overlaps with the majority class. In this paper, we propose a Mutual Information constrained Semantically Oversampling framework (MISO) that can generate anchor instances to help the backbone network determine the re-embedding position of a non-overlapping representation for each difficult sample. MISO consists of (1) a semantic fusion module that learns entangled semantics among difficult and majority samples with an adaptive multi-head attention mechanism, (2) a mutual information loss that forces our model to learn new representations of entangled semantics in the non-overlapping region of the minority class, and (3) a coupled adversarial encoder-decoder that fine-tunes the disentangled semantic representations so that they retain their correlations with the minority class, and then uses these disentangled semantic representations to generate anchor instances for each difficult sample. Experiments on a variety of imbalanced text classification tasks demonstrate that anchor instances help classifiers achieve significant improvements over strong baselines.

constrained semantically oversampling, imbalanced text classification, information constrained semantically

A Pretraining Numerical Reasoning Model for Ordinal Constrained Question Answering on Knowledge Base

308 - Association for Computational Linguistics 2021 Article

Knowledge Base Question Answering (KBQA) is the task of answering natural language questions posed over knowledge bases (KBs). This paper aims to equip IR-based KBQA models with the ability to perform numerical reasoning for answering ordinal constrained questions. A major challenge is the lack of explicit annotations about numerical properties. To address this challenge, we propose a pretraining numerical reasoning model consisting of NumGNN and NumTransformer, guided by explicit self-supervision signals. The two modules are pretrained to encode the magnitude and ordinal properties of numbers, respectively, and can serve as model-agnostic plugins for any IR-based KBQA model to enhance its numerical reasoning ability. Extensive experiments on two KBQA benchmarks verify that our method enhances the numerical reasoning ability of IR-based KBQA models.

knowledge base question, base question answering, constrained question answering

Learn to Copy from the Copying History: Correlational Copy Network for Abstractive Summarization

492 - Association for Computational Linguistics 2021 Article

The copying mechanism has had considerable success in abstractive summarization, allowing models to copy words directly from the input text to the output summary. Existing works mostly employ encoder-decoder attention, which applies copying at each time step independently of the previous ones. However, this may sometimes lead to incomplete copying. In this paper, we propose a novel copying scheme named Correlational Copying Network (CoCoNet) that enhances the standard copying mechanism by keeping track of the copying history. It thereby takes advantage of prior copying distributions and, at each time step, explicitly encourages the model to copy the input word that is relevant to the previously copied one. In addition, we strengthen CoCoNet through pre-training on suitable corpora that simulate the copying behaviors. Experimental results show that CoCoNet copies more accurately and achieves new state-of-the-art performance on summarization benchmarks, including CNN/DailyMail for news summarization and SAMSum for dialogue summarization. The code and checkpoint will be publicly available.

lexically constrained, correlational copy network, correlational copying network

DMix: Distance Constrained Interpolative Mixup

68 - Association for Computational Linguistics 2021 Article

Interpolation-based regularisation methods have proven to be effective for various tasks and modalities. Mixup is a data augmentation method that generates virtual training samples from convex combinations of individual inputs and labels. We extend Mixup and propose DMix, a distance-constrained interpolative Mixup for sentence classification that leverages hyperbolic space. DMix achieves state-of-the-art results on sentence classification over existing data augmentation methods across datasets in four languages.

distance constrained interpolative, distance constrained, constrained interpolative mixup
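The convex combination that the DMix abstract above attributes to standard Mixup can be written out in a few lines. This sketch covers only plain Mixup on feature vectors and one-hot labels. The distance constraint and the hyperbolic-space component of DMix are not shown, and the function names and the Beta-distributed mixing weight with alpha = 0.4 are illustrative assumptions.

```python
import random


def sample_lambda(alpha=0.4):
    """Draw a mixing weight in [0, 1] from Beta(alpha, alpha).

    The value of alpha is a hypothetical default, not from the abstract.
    """
    return random.betavariate(alpha, alpha)


def mixup(x1, y1, x2, y2, lam):
    """Return the convex combination of two (input, label) pairs.

    x1, x2: feature vectors of equal length.
    y1, y2: label vectors (e.g. one-hot) of equal length.
    lam:    mixing weight; lam = 1 reproduces the first sample.
    """
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


# Mix two toy samples with a fixed weight of 0.25.
x_mixed, y_mixed = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], 0.25)
```

With a weight of 0.25 the virtual sample sits three quarters of the way toward the second input, and its soft label is weighted in the same proportion.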

NHK's Lexically-Constrained Neural Machine Translation at WAT 2021

396 - Association for Computational Linguistics 2021 Article

This paper describes the system of our team (NHK) for the WAT 2021 Japanese-English restricted machine translation task. The aim of this task is to improve translation quality while maintaining consistent terminology in scientific paper translation. The task has a unique feature: some words of the target sentence are given in addition to the source sentence. In this paper, we use a lexically-constrained neural machine translation (NMT) model, which concatenates the source sentence and the constrained words with a special token and feeds them into the NMT encoder. The key to successful lexically-constrained NMT is how constraints are extracted from the target sentences of the training data. We propose two extraction methods: proper-noun constraint and mistranslated-word constraint. These two methods consider the importance of words and the fallibility of NMT, respectively. The evaluation results demonstrate the effectiveness of our lexical-constraint method.

translation results, lexically-constrained neural machine, neural machine, lexically constrained
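The encoder input described in the NHK abstract above, a source sentence concatenated with constraint words through a special token, might be assembled as in the following sketch. The token string "<sep>", the use of a single separator before all constraints, and the function name are assumptions; the abstract does not state the exact input layout.

```python
def build_encoder_input(source_tokens, constraint_tokens, sep_token="<sep>"):
    """Concatenate source tokens and constraint words for the encoder.

    Assumed layout: source tokens, one separator token, then all
    constraint words. "<sep>" is a placeholder name for the special token.
    """
    if not constraint_tokens:
        # Without constraints the source sentence is passed through as is.
        return list(source_tokens)
    return list(source_tokens) + [sep_token] + list(constraint_tokens)


# Toy tokens for illustration only.
encoder_input = build_encoder_input(
    ["kono", "ronbun", "wa"],
    ["paper", "translation"],
)
```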

A Hybrid Genetic-Continuation Algorithm for Solving Constrained Optimization Problems

1353 - Damascus University 2001 Research paper

We constructed a continuation predictor-corrector algorithm that solves constrained optimization problems. Smooth penalty functions combined with numerical continuation, along with the use of the expanded Lagrangian system, were essential components of the algorithm. An improvement of this algorithm was published, which dealt with the linear algebra in the corrector part of the algorithm.

algorithm, constrained optimization problems, penalty functions
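The role of a smooth penalty function in the abstract above can be illustrated on a toy problem. The sketch below is not the predictor-corrector continuation algorithm or the expanded Lagrangian system of the paper. It is a plain quadratic-penalty method with warm starts, chosen as a simpler stand-in. The problem (minimize x^2 subject to x >= 1), the penalty schedule, and the function names are all assumptions.

```python
def penalized_minimum(mu, x0, steps=50):
    """Minimize x**2 + mu * max(0, 1 - x)**2 by gradient descent.

    The penalty term is continuously differentiable and is zero whenever
    the constraint x >= 1 holds.
    """
    x = x0
    # 1 / (2 + 2 * mu) is the inverse of the largest curvature of the objective.
    step = 1.0 / (2.0 + 2.0 * mu)
    for _ in range(steps):
        grad = 2.0 * x - 2.0 * mu * max(0.0, 1.0 - x)
        x -= step * grad
    return x


def solve_with_increasing_penalty(mus=(1.0, 10.0, 100.0, 1000.0)):
    """Follow the penalized minimizers as the penalty weight grows.

    Each solve is warm-started from the previous solution, a crude
    analogue of tracing a solution path.
    """
    x = 0.0
    for mu in mus:
        x = penalized_minimum(mu, x)
    return x


x_final = solve_with_increasing_penalty()
```

For this toy problem the penalized minimizer is mu / (1 + mu), so the iterates approach the constrained solution x = 1 from the infeasible side as mu grows.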
