We often use perturbations to regularize neural models. For neural encoder-decoders, previous studies have applied scheduled sampling (Bengio et al., 2015) and adversarial perturbations (Sato et al., 2019) as perturbations, but these methods require considerable computational time. This study therefore addresses the question of whether such approaches are efficient enough in terms of training time. We compare several perturbations on sequence-to-sequence problems with respect to computational time. Experimental results show that simple techniques such as word dropout (Gal and Ghahramani, 2016) and random replacement of input tokens achieve scores comparable to (or better than) the recently proposed perturbations, while also being faster.
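The two simple input perturbations mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the `<unk>` placeholder, and the perturbation probability `p` are assumptions for the example. Word dropout replaces each input token with an unknown-token symbol with some probability, while random replacement swaps a token for one drawn uniformly from the vocabulary.

```python
import random

def word_dropout(tokens, unk_token="<unk>", p=0.1, rng=None):
    # Word dropout: each token becomes unk_token with probability p.
    rng = rng or random.Random(0)
    return [unk_token if rng.random() < p else t for t in tokens]

def random_replacement(tokens, vocab, p=0.1, rng=None):
    # Random replacement: each token is swapped for a uniformly
    # sampled vocabulary token with probability p.
    rng = rng or random.Random(0)
    return [rng.choice(vocab) if rng.random() < p else t for t in tokens]
```

Both run in a single pass over the input sequence with no extra forward or backward passes, which is why they are much cheaper than scheduled sampling or adversarial perturbations.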