Making Heads and Tails of Models with Marginal Calibration for Sparse Tagsets



Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: heads and tails; making heads; sparse tagsets


Abstract

For interpreting the behavior of a probabilistic model, it is useful to measure the model's calibration, that is, the extent to which it produces reliable confidence scores. We address the open problem of calibration for tagging models with sparse tagsets, and recommend strategies to measure and reduce calibration error (CE) in such models. We show that several post-hoc recalibration techniques all reduce calibration error across the marginal distribution for two existing sequence taggers. Moreover, we propose tag frequency grouping (TFG) as a way to measure calibration error in different frequency bands. Further, recalibrating each group separately promotes a more equitable reduction of calibration error across the tag frequency spectrum.
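As a rough illustration of measuring calibration error per tag frequency band, the sketch below bins confidence scores and compares mean confidence with observed accuracy in each bin, then repeats this for groups of tags ranked by training frequency. The abstract does not give the paper's exact estimator or grouping rule, so everything here is an assumption: the equal-width binning, the equal-size frequency groups, integer tag ids, and the function names `expected_calibration_error`, `group_tags_by_frequency`, and `calibration_error_by_group` are all hypothetical.

```python
from collections import Counter

import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned calibration error: weighted mean |accuracy - confidence| over bins.

    Assumption: equal-width bins on [0, 1]; the paper may use a different estimator.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Right-inclusive bins, so a confidence of exactly 1.0 lands in the last bin.
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the share of samples in the bin
    return float(ece)


def group_tags_by_frequency(train_tags, n_groups=3):
    """Map each tag to a frequency band (0 = most frequent tags).

    Assumption: bands hold roughly equal numbers of tags, ranked by count.
    """
    ranked = [tag for tag, _ in Counter(train_tags).most_common()]
    return {tag: (rank * n_groups) // len(ranked) for rank, tag in enumerate(ranked)}


def calibration_error_by_group(probs, gold, tag_to_group, n_bins=10):
    """Calibration error per band, over the marginal probability of every tag in it.

    probs: (n_tokens, n_tags) predicted distribution; gold: (n_tokens,) integer tag ids.
    """
    probs = np.asarray(probs, dtype=float)
    gold = np.asarray(gold)
    result = {}
    for group in sorted(set(tag_to_group.values())):
        tags = [t for t, g in tag_to_group.items() if g == group]
        conf = probs[:, tags].ravel()
        hit = (gold[:, None] == np.array(tags)[None, :]).ravel()
        result[group] = expected_calibration_error(conf, hit, n_bins)
    return result
```

Under these assumptions, recalibrating each group separately would amount to fitting one post-hoc calibrator per band returned by `group_tags_by_frequency` rather than a single calibrator for all tags.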

References used

https://aclanthology.org/


Exploring Strategies for Generalizable Commonsense Reasoning with Pre-trained Models

Association for Computational Linguistics, 2021 (article)

Commonsense reasoning benchmarks have been largely solved by fine-tuning language models. The downside is that fine-tuning may cause models to overfit to task-specific data and thereby forget their knowledge gained during pre-training. Recent works only propose lightweight model updates as models may already possess useful knowledge from past experience, but a challenge remains in understanding what parts and to what extent models should be refined for a given task. In this paper, we investigate what models learn from commonsense reasoning datasets. We measure the impact of three different adaptation methods on the generalization and accuracy of models. Our experiments with two models show that fine-tuning performs best, by learning both the content and the structure of the task, but suffers from overfitting and limited generalization to novel answers. We observe that alternative adaptation methods like prefix-tuning have comparable accuracy, but generalize better to unseen answers and are more robust to adversarial splits.

Keywords: generalizable commonsense reasoning; strategies for generalizable; exploring strategies

On the Effects of Transformer Size on In- and Out-of-Domain Calibration

Association for Computational Linguistics, 2021 (article)

Large, pre-trained transformer language models, which are pervasive in natural language processing tasks, are notoriously expensive to train. To reduce the cost of training such large models, prior work has developed smaller, more compact models which achieve a significant speedup in training time while maintaining competitive accuracy to the original model on downstream tasks. Though these smaller pre-trained models have been widely adopted by the community, it is not known how well they are calibrated compared to their larger counterparts. In this paper, focusing on a wide range of tasks, we thoroughly investigate the calibration properties of pre-trained transformers, as a function of their size. We demonstrate that when evaluated in-domain, smaller models are able to achieve competitive, and often better, calibration compared to larger models, while achieving significant speedup in training time. Post-hoc calibration techniques further reduce calibration error for all models in-domain. However, when evaluated out-of-domain, larger models tend to be better calibrated, and label-smoothing instead is an effective strategy to calibrate models in this setting.
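For readers unfamiliar with label smoothing, the sketch below shows one common form: the one-hot training target is mixed with a uniform distribution over classes, which discourages the model from assigning probability 1 to any single label. The abstract does not say which variant or smoothing weight the authors use, so the uniform mixing, the default `epsilon=0.1`, and the helper names `smooth_labels` and `cross_entropy` are assumptions made for illustration.

```python
import numpy as np


def smooth_labels(labels, n_classes, epsilon=0.1):
    """Return smoothed targets: (1 - epsilon) * one_hot + epsilon / n_classes.

    Assumption: uniform smoothing; epsilon is an illustrative value only.
    """
    one_hot = np.eye(n_classes)[np.asarray(labels)]
    return (1.0 - epsilon) * one_hot + epsilon / n_classes


def cross_entropy(probs, targets):
    """Mean cross-entropy between predicted distributions and (soft) targets."""
    probs = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return float(-(targets * np.log(probs)).sum(axis=1).mean())
```

Training against `smooth_labels(...)` instead of hard one-hot targets penalizes fully confident predictions even when they are correct, which is the usual intuition for why smoothing can help calibration.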

Keywords: effects of transformer

Challenging distributional models with a conceptual network of philosophical terms

Association for Computational Linguistics, 2021 (article)

Computational linguistic research on language change through distributional semantic (DS) models has inspired researchers from fields such as philosophy and literary studies, who use these methods for the exploration and comparison of comparatively small datasets traditionally analyzed by close reading. Research on methods for small data is still in early stages and it is not clear which methods achieve the best results. We investigate the possibilities and limitations of using distributional semantic models for analyzing philosophical data by means of a realistic use-case. We provide a ground truth for evaluation created by philosophy experts and a blueprint for using DS models in a sound methodological setup. We compare three methods for creating specialized models from small datasets. Though the models do not perform well enough to directly support philosophers yet, we find that models designed for small data yield promising directions for future work.

Keywords: challenging distributional models; conceptual network; challenging distributional

Decontextualization: Making Sentences Stand-Alone

Association for Computational Linguistics, 2021 (article)

Models for question answering, dialogue agents, and summarization often interpret the meaning of a sentence in a rich context and use that meaning in a new context. Taking excerpts of text can be problematic, as key pieces may not be explicit in a local window. We isolate and define the problem of sentence decontextualization: taking a sentence together with its context and rewriting it to be interpretable out of context, while preserving its meaning. We describe an annotation procedure, collect data on the Wikipedia corpus, and use the data to train models to automatically decontextualize sentences. We present preliminary studies that show the value of sentence decontextualization in a user-facing task, and as preprocessing for systems that perform document understanding. We argue that decontextualization is an important subtask in many downstream applications, and that the definitions and resources provided can benefit tasks that operate on sentences that occur in a richer context.

Keywords: making sentences stand-alone; making sentences; making

Quality and competence Assurance of testing and calibration laboratories according to the international standards

Al-Baath University, 2017 (research paper)

The ISO/IEC 17025 international standard for quality and competence assurance of testing and calibration laboratories was previously known as ISO Guide 25; the current standard is ISO/IEC 17025:2005.

Keywords: quality; competence; laboratory quality assurance; competence of testing and calibration laboratories; testing and calibration laboratories
