
Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality


Publication date: 2021
Language: English





In human-level NLP tasks, such as predicting mental health, personality, or demographics, the number of observations is often smaller than the standard 768+ hidden-state sizes of each layer within modern transformer-based language models, limiting the ability to effectively leverage transformers. Here, we provide a systematic study on the role of dimension reduction methods (principal components analysis, factorization techniques, or multi-layer auto-encoders), as well as the dimensionality of embedding vectors and sample sizes, as a function of predictive performance. We first find that fine-tuning large models with a limited amount of data poses a significant difficulty which can be overcome with a pre-trained dimension reduction regime. RoBERTa consistently achieves top performance in human-level tasks, with PCA giving benefit over other reduction methods in better handling users that write longer texts. Finally, we observe that a majority of the tasks achieve results comparable to the best performance with just 1/12 of the embedding dimensions.
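A minimal sketch of this kind of pipeline is given below, assuming the Hugging Face transformers and scikit-learn libraries; roberta-base, the 64-component PCA (roughly 1/12 of 768), and the variables train_texts, train_labels, test_texts, and test_labels are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch: mean-pool RoBERTa's last hidden layer into one vector per user,
# reduce 768 dimensions to 64 (~1/12) with PCA, then fit a small linear model.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.decomposition import PCA
from sklearn.linear_model import RidgeCV

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base").eval()

def embed(texts):
    """Return one mean-pooled 768-d vector per text."""
    vectors = []
    with torch.no_grad():
        for text in texts:
            inputs = tokenizer(text, truncation=True, return_tensors="pt")
            hidden = model(**inputs).last_hidden_state   # (1, seq_len, 768)
            vectors.append(hidden.mean(dim=1).squeeze(0).numpy())
    return vectors

# train_texts, train_labels, test_texts, test_labels are hypothetical placeholders.
X_train, X_test = embed(train_texts), embed(test_texts)

pca = PCA(n_components=64).fit(X_train)               # dimension reduction step
predictor = RidgeCV().fit(pca.transform(X_train), train_labels)
print(predictor.score(pca.transform(X_test), test_labels))
```

In the "pre-trained dimension reduction regime" the abstract refers to, the reducer would presumably be fit on a larger pool of embeddings rather than on the small labeled training set alone.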




Related research

We outline the Great Misalignment Problem in natural language processing research: the problem definition is not in line with the proposed method, and the human evaluation is in line with neither the definition nor the method. We study this misalignment problem by surveying 10 randomly sampled papers published in ACL 2020 that report results with human evaluation. Our results show that only one paper was fully in line in terms of problem definition, method, and evaluation, and only two papers presented a human evaluation that was in line with what was modeled in the method. These results highlight that the Great Misalignment Problem is a major one and that it affects the validity and reproducibility of results obtained by human evaluation.
This study carries out a systematic intrinsic evaluation of the semantic representations learned by state-of-the-art pre-trained multimodal Transformers. These representations are claimed to be task-agnostic and have been shown to help on many downstream language-and-vision tasks. However, the extent to which they align with human semantic intuitions remains unclear. We experiment with various models and obtain static word representations from the contextualized ones they learn. We then evaluate them against the semantic judgments provided by human speakers. In line with previous evidence, we observe a generalized advantage of multimodal representations over language-only ones on concrete word pairs, but not on abstract ones. On the one hand, this confirms the effectiveness of these models at aligning language and vision, which results in better semantic representations for concepts that are grounded in images. On the other hand, the models are shown to follow different representation learning patterns, which sheds some light on how and when they perform multimodal integration.
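A hedged sketch of this evaluation idea (not the paper's code): average a word's contextual embeddings over a few example sentences to obtain a static vector, then correlate model cosine similarities with human similarity ratings. The choice of bert-base-uncased, the toy contexts, and the human_pairs ratings are illustrative assumptions.

```python
# Hedged sketch: derive static word vectors from contextualized embeddings and
# compare model similarities with (toy) human similarity judgments.
import numpy as np
import torch
from scipy.stats import spearmanr
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def static_vector(word, sentences):
    """Average the contextual embeddings of `word`'s sub-tokens across sentences."""
    word_ids = set(tokenizer(word, add_special_tokens=False)["input_ids"])
    vectors = []
    with torch.no_grad():
        for sentence in sentences:
            inputs = tokenizer(sentence, return_tensors="pt")
            hidden = model(**inputs).last_hidden_state.squeeze(0)   # (seq_len, 768)
            ids = inputs["input_ids"].squeeze(0).tolist()
            positions = [i for i, tok in enumerate(ids) if tok in word_ids]
            vectors.append(hidden[positions].mean(dim=0).numpy())
    return np.mean(vectors, axis=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy stand-ins for human similarity judgments and example contexts.
human_pairs = [("dog", "cat", 7.6), ("car", "bicycle", 6.1), ("dog", "car", 2.0)]
contexts = {"dog": ["The dog barked at the mailman."],
            "cat": ["The cat slept on the sofa."],
            "car": ["The car would not start this morning."],
            "bicycle": ["She rode her bicycle to work."]}

model_sims = [cosine(static_vector(w1, contexts[w1]), static_vector(w2, contexts[w2]))
              for w1, w2, _ in human_pairs]
human_sims = [rating for _, _, rating in human_pairs]
print(spearmanr(model_sims, human_sims))
```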
The research aims to estimate the effect of sample size on the power of the statistical test (t) for one sample, two related samples, and two independent samples, and on the power of the one-way analysis of variance test (F) for comparing means. The descriptive method was used, and samples of different sizes (300 items) were generated using the PASS 14 program, taking care that the data satisfy the set of assumptions required by the (t) and (F) tests: random sampling, an interval level of measurement, normal distribution, and homogeneity of variance.
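For illustration only (the study's PASS 14 computations are not reproduced here), a short sketch of how power grows with sample size can be written with statsmodels; the effect sizes (Cohen's d = 0.5, Cohen's f = 0.25), alpha = 0.05, and the three ANOVA groups are assumptions.

```python
# Hedged sketch: statistical power of t-tests and a one-way ANOVA F-test as a
# function of sample size, at assumed effect sizes and alpha = 0.05.
from statsmodels.stats.power import TTestPower, TTestIndPower, FTestAnovaPower

alpha = 0.05
for n in (20, 50, 100, 300):
    # One-sample t-test (two related samples use the same machinery on difference scores).
    one_sample = TTestPower().power(effect_size=0.5, nobs=n, alpha=alpha)
    # Two independent samples, n observations per group.
    independent = TTestIndPower().power(effect_size=0.5, nobs1=n, alpha=alpha)
    # One-way ANOVA with 3 groups and n observations per group (3*n in total).
    anova = FTestAnovaPower().power(effect_size=0.25, nobs=3 * n, alpha=alpha, k_groups=3)
    print(f"n={n:>3}  one-sample t: {one_sample:.2f}  "
          f"independent t: {independent:.2f}  one-way F: {anova:.2f}")
```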
This paper reviews and summarizes human evaluation practices described in 97 style transfer papers with respect to three main evaluation aspects: style transfer, meaning preservation, and fluency. In principle, evaluations by human raters should be the most reliable. However, in style transfer papers, we find that protocols for human evaluations are often underspecified and not standardized, which hampers the reproducibility of research in this field and progress toward better human and automatic evaluation methods.
The research aims to develop several sample size formulas, characterize them, and compare them with one another in order to determine the best formula for calculating sample size, and to construct a modification that reflects well on the sample size; it also aims to specify the first and second individual satisfaction levels for the relevant formulas, so that the mathematical equations can predict the sample size whatever the population size. The study reached the following results: the related formulas gave identical sample sizes when the required conditions were met; sample size did not increase significantly with increasing population size at the first satisfaction level; there were no significant differences between sample sizes according to population size at the individual satisfaction levels; significant differences do exist between the sample size and the average total inspection according to population size at the individual satisfaction levels. Mathematical models of the relationship between the sample size, the population size, and the average total inspection were obtained, and a comprehensive table was developed that gives the sample size corresponding to each population size, which researchers can use and to which the formulas can be applied as long as the conditions on which they were originally based are met.
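The study's own formulas are not reproduced here; as a generic, hedged illustration of how a sample size formula can depend on population size, the sketch below uses Cochran's formula for a proportion with a finite-population correction (the margin of error, confidence level, and p = 0.5 are assumptions).

```python
# Hedged illustration (not the study's formulas): Cochran's sample size formula
# for a proportion, with a finite-population correction for population size N.
import math

def cochran_sample_size(N, e=0.05, z=1.96, p=0.5):
    """Sample size needed to estimate a proportion in a population of size N."""
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)       # infinite-population sample size
    return math.ceil(n0 / (1 + (n0 - 1) / N))    # finite-population correction

for N in (500, 5_000, 50_000, 1_000_000):
    print(f"N={N:>9,}  n={cochran_sample_size(N)}")
# The required n levels off (here near 385) as N grows, illustrating why sample
# size need not keep increasing with population size.
```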
