Frustratingly Easy Edit-based Linguistic Steganography with a Masked Language Model

Added by the Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

With advances in neural language models, the focus of linguistic steganography has shifted from edit-based approaches to generation-based ones. While the latter's payload capacity is impressive, generating genuine-looking texts remains challenging. In this paper, we revisit edit-based linguistic steganography, with the idea that a masked language model offers an off-the-shelf solution. The proposed method eliminates painstaking rule construction and has a high payload capacity for an edit-based model. It is also shown to be more secure against automatic detection than a generation-based method while offering better control of the security/payload capacity trade-off.
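The abstract does not spell out the embedding procedure, so the following is only a minimal sketch of the general idea behind edit-based steganography with a masked language model, not the paper's algorithm. It assumes the sender and receiver share a ranking function that returns candidate fillers for a masked position, and that secret bits select which candidate replaces the original word. The names `toy_ranker`, `TOY_CANDIDATES`, `encode`, `decode`, and `bits_per_slot` are hypothetical, and the toy ranker stands in for a real model.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a masked language model: given the tokens and a
# masked position, return candidate fillers ranked from most to least likely.
Ranker = Callable[[Sequence[str], int], List[str]]

# Assumption: a fixed candidate table replaces real model predictions.
TOY_CANDIDATES = {
    1: ["quick", "small", "brown", "young"],
    4: ["over", "past", "onto", "near"],
}


def toy_ranker(tokens: Sequence[str], position: int) -> List[str]:
    """Ignore the context and look up candidates by position (toy only)."""
    return TOY_CANDIDATES[position]


def encode(cover: Sequence[str], positions: Sequence[int], bits: str,
           ranker: Ranker, bits_per_slot: int = 2) -> List[str]:
    """Hide `bits` by replacing each chosen position with a ranked candidate."""
    if len(bits) != len(positions) * bits_per_slot:
        raise ValueError("bit string length must match the available slots")
    tokens = list(cover)
    for i, pos in enumerate(positions):
        chunk = bits[i * bits_per_slot:(i + 1) * bits_per_slot]
        candidates = ranker(tokens, pos)[:2 ** bits_per_slot]
        # The integer value of the chunk picks the replacement word.
        tokens[pos] = candidates[int(chunk, 2)]
    return tokens


def decode(stego: Sequence[str], positions: Sequence[int],
           ranker: Ranker, bits_per_slot: int = 2) -> str:
    """Recover the bits from the index of each observed word."""
    chunks = []
    for pos in positions:
        candidates = ranker(stego, pos)[:2 ** bits_per_slot]
        chunks.append(format(candidates.index(stego[pos]), f"0{bits_per_slot}b"))
    return "".join(chunks)


cover = "the old fox jumped across the fence".split()
stego = encode(cover, [1, 4], "1001", toy_ranker)
recovered = decode(stego, [1, 4], toy_ranker)
```

In this sketch, `bits_per_slot` is the knob for the trade-off the abstract mentions: fewer candidates per slot keep the replacements closer to the model's top predictions but carry fewer bits. With a real model, both parties would also need to agree on which positions are masked so that they compute identical rankings.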

References used

https://aclanthology.org/

Ad Headline Generation using Self-Critical Masked Language Model

Association for Computational Linguistics, 2021 (article)

For any E-commerce website it is a nontrivial problem to build enduring advertisements that attract shoppers. It is hard to pass the creative quality bar of the website, especially at a large scale. We thus propose a programmatic solution to generate product advertising headlines using retail content. We propose a state-of-the-art application of Reinforcement Learning (RL) policy gradient methods on Transformer (Vaswani et al., 2017) based Masked Language Models (Devlin et al., 2019). Our method creates the advertising headline by jointly conditioning on multiple products that a seller wishes to advertise. We demonstrate that our method outperforms existing Transformer and LSTM + RL methods in overlap metrics and quality audits. We also show that our model-generated headlines outperform human-submitted headlines in terms of both grammar and creative quality as determined by audits.

Knowledge Enhanced Masked Language Model for Stance Detection

Association for Computational Linguistics, 2021 (article)

Detecting stance on Twitter is especially challenging because of the short length of each tweet, the continuous coinage of new terminology and hashtags, and the deviation of sentence structure from standard prose. Fine-tuned language models using large-scale in-domain data have been shown to be the new state-of-the-art for many NLP tasks, including stance detection. In this paper, we propose a novel BERT-based fine-tuning method that enhances the masked language model for stance detection. Instead of random token masking, we propose using a weighted log-odds-ratio to identify words with high stance distinguishability and then model an attention mechanism that focuses on these words. We show that our proposed approach outperforms the state of the art for stance detection on Twitter data about the 2020 US Presidential election.

Paradigm Clustering with Weighted Edit Distance

Association for Computational Linguistics, 2021 (article)

This paper describes our system for the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering, which asks participants to group inflected forms together according to their underlying lemma without the aid of annotated training data. We employ agglomerative clustering to group word forms together using a metric that combines an orthographic distance and a semantic distance from word embeddings. We experiment with two variations of an edit distance-based model for quantifying orthographic distance, but, due to time constraints, our system does not improve over the shared task's baseline system.
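As a rough illustration of a metric that mixes orthographic and semantic distance, the sketch below pairs a weighted edit distance with a cosine distance between embeddings. The cost parameters, the length normalization, the mixing weight `alpha`, and the function names are all assumptions made for this example; the abstract does not specify how the system combines the two distances.

```python
import math
from typing import Sequence


def weighted_edit_distance(a: str, b: str, insert_cost: float = 1.0,
                           delete_cost: float = 1.0,
                           substitute_cost: float = 1.0) -> float:
    """Levenshtein distance with configurable operation costs."""
    # prev[j] holds the cost of turning the processed prefix of a into b[:j].
    prev = [j * insert_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * delete_cost]
        for j, cb in enumerate(b, start=1):
            substitute = prev[j - 1] + (0.0 if ca == cb else substitute_cost)
            curr.append(min(prev[j] + delete_cost,
                            curr[j - 1] + insert_cost,
                            substitute))
        prev = curr
    return prev[-1]


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm


def combined_distance(a: str, b: str, emb_a: Sequence[float],
                      emb_b: Sequence[float], alpha: float = 0.5) -> float:
    """Mix length-normalized edit distance with embedding distance."""
    longest = max(len(a), len(b)) or 1  # avoid dividing by zero
    orthographic = weighted_edit_distance(a, b) / longest
    semantic = cosine_distance(emb_a, emb_b)
    return alpha * orthographic + (1.0 - alpha) * semantic


# Hypothetical two-dimensional embeddings, used only for illustration.
d = combined_distance("walk", "walked", [1.0, 0.0], [1.0, 0.0])
```

A pairwise matrix of such distances could then be handed to an agglomerative clustering routine; lowering `substitute_cost` is one way to make stem-internal vowel changes cheaper than affixation.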

MG-BERT: Multi-Graph Augmented BERT for Masked Language Modeling

Association for Computational Linguistics, 2021 (article)

Pre-trained models like Bidirectional Encoder Representations from Transformers (BERT) have recently made a big leap forward in Natural Language Processing (NLP) tasks. However, there are still some shortcomings in the Masked Language Modeling (MLM) task performed by these models. In this paper, we first introduce a multi-graph including different types of relations between words. Then, we propose the Multi-Graph augmented BERT (MG-BERT) model, which is based on BERT. MG-BERT embeds tokens while taking advantage of a static multi-graph containing global word co-occurrences in the text corpus alongside global real-world facts about words in knowledge graphs. The proposed model also employs a dynamic sentence graph to capture local context effectively. Experimental results demonstrate that our model can considerably enhance the performance in the MLM task.

EDITOR: An Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints

Association for Computational Linguistics, 2021 (article)

We introduce an Edit-Based TransfOrmer with Repositioning (EDITOR), which makes sequence generation flexible by seamlessly allowing users to specify preferences in output lexical choice. Building on recent models for non-autoregressive sequence generation (Gu et al., 2019), EDITOR generates new sequences by iteratively editing hypotheses. It relies on a novel reposition operation designed to disentangle lexical choice from word positioning decisions, while enabling efficient oracles for imitation learning and parallel edits at decoding time. Empirically, EDITOR uses soft lexical constraints more effectively than the Levenshtein Transformer (Gu et al., 2019) while speeding up decoding dramatically compared to constrained beam search (Post and Vilar, 2018). EDITOR also achieves comparable or better translation quality with faster decoding speed than the Levenshtein Transformer on standard Romanian-English, English-German, and English-Japanese machine translation tasks.
