Once integrated into clinical care, patient risk stratification models may perform worse than they did during retrospective validation. It is widely accepted that performance will degrade over time due to changes in care processes and patient populations, but the extent to which this occurs is poorly understood, in part because few researchers report prospective validation performance. In this study, we compare the 2020-2021 (20-21) prospective performance of a patient risk stratification model for predicting healthcare-associated infections to a 2019-2020 (19-20) retrospective validation of the same model. We define the difference between retrospective and prospective performance as the performance gap. We estimate the extent to which i) temporal shift, i.e., changes in clinical workflows and patient populations, and ii) infrastructure shift, i.e., changes in the access, extraction, and transformation of data, each contribute to the performance gap. Applied prospectively to 26,864 hospital encounters during a twelve-month period from July 2020 to June 2021, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.767 (95% confidence interval (CI): 0.737, 0.801) and a Brier score of 0.189 (95% CI: 0.186, 0.191). Prospective performance decreased slightly compared to the 19-20 retrospective performance, in which the model achieved an AUROC of 0.778 (95% CI: 0.744, 0.815) and a Brier score of 0.163 (95% CI: 0.161, 0.165). The resulting performance gap was primarily due to infrastructure shift rather than temporal shift. So long as we continue to develop and validate models using data stored in large research data warehouses, we must consider differences in how and when data are accessed, measure how these differences may affect prospective performance, and work to mitigate those differences.
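As an illustration of the kind of evaluation described above, the following is a minimal sketch (not the authors' code) of computing AUROC and Brier score with bootstrap 95% confidence intervals; the function names, bootstrap settings, and usage variables are assumptions for the example.

```python
# Minimal sketch of quantifying a retrospective-vs-prospective performance gap.
# Not the authors' code; the bootstrap setup and variable names are illustrative.
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

def bootstrap_metrics(y_true, y_prob, n_boot=1000, seed=0):
    """Return (mean, 2.5th pct, 97.5th pct) for AUROC and Brier score."""
    rng = np.random.default_rng(seed)
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    n = len(y_true)
    aurocs, briers = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)             # resample encounters with replacement
        if len(np.unique(y_true[idx])) < 2:     # AUROC needs both classes present
            continue
        aurocs.append(roc_auc_score(y_true[idx], y_prob[idx]))
        briers.append(brier_score_loss(y_true[idx], y_prob[idx]))
    summarize = lambda v: (np.mean(v), np.percentile(v, 2.5), np.percentile(v, 97.5))
    return {"auroc": summarize(aurocs), "brier": summarize(briers)}

# Hypothetical usage: the gap is the difference between retrospective (19-20)
# and prospective (20-21) estimates.
# retro = bootstrap_metrics(y_1920, p_1920)
# prosp = bootstrap_metrics(y_2021, p_2021)
# auroc_gap = retro["auroc"][0] - prosp["auroc"][0]
```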
With the deployment of smart meters, electricity consumption data can be collected for individual consumer buildings at high temporal resolution. The availability of such data has made it possible to study the daily load demand profiles of households. Clustering households based on their demand profiles is one of the first, yet key, steps in such analysis. While many clustering algorithms/frameworks can be deployed to perform this clustering, they usually generate very different clusters. To identify the best clustering results, various cluster validation indices (CVIs) have been proposed in the literature. However, it has been observed that different CVIs often recommend different algorithms, which leads to the problem of identifying the most suitable CVI for a given dataset. In response to this problem, this paper shows how the recommendations of validation indices are influenced by different data characteristics that might be present in a typical residential load demand dataset. Furthermore, the paper identifies the characteristics of the data that favor or preclude the use of a particular cluster validation index.
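To make the disagreement between CVIs concrete, here is an illustrative sketch (not the paper's code) that scores a few candidate k-means clusterings of synthetic daily load profiles with three common indices from scikit-learn; the data shape and the choice of indices are assumptions.

```python
# Illustrative sketch: different CVIs can prefer different numbers of clusters
# (and, more generally, different clustering results) on the same data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (silhouette_score,
                             calinski_harabasz_score,
                             davies_bouldin_score)

rng = np.random.default_rng(0)
# Assumed data layout: one row per household, 48 half-hourly readings per day
# (synthetic stand-in for real smart-meter demand profiles).
profiles = rng.random((500, 48))

for k in (3, 4, 5, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(profiles)
    print(f"k={k}: "
          f"silhouette={silhouette_score(profiles, labels):.3f}  "               # higher is better
          f"calinski_harabasz={calinski_harabasz_score(profiles, labels):.1f}  "  # higher is better
          f"davies_bouldin={davies_bouldin_score(profiles, labels):.3f}")          # lower is better
```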
While the traditional viewpoint in machine learning and statistics assumes that training and testing samples come from the same population, practice belies this fiction. One strategy, coming from robust statistics and optimization, is thus to build a model robust to distributional perturbations. In this paper, we take a different approach and describe procedures for robust predictive inference, where a model provides uncertainty estimates on its predictions rather than point predictions. We present a method that produces prediction sets (almost exactly) giving the right coverage level for any test distribution in an $f$-divergence ball around the training population. The method, based on conformal inference, achieves (nearly) valid coverage in finite samples, under only the condition that the training data be exchangeable. An essential component of our methodology is to estimate the amount of expected future data shift and to build robustness to it; we develop estimators and prove their consistency for protection and validity of uncertainty estimates under shifts. Through experiments on several large-scale benchmark datasets, including Recht et al.'s CIFAR-v4 and ImageNet-V2 datasets, we provide complementary empirical results that highlight the importance of robust predictive validity.
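For context, the following is a minimal sketch of the standard split conformal construction that such methods build on; per the abstract, the robust variant would further inflate the calibration quantile to a worst-case quantile over an $f$-divergence ball, which is not reproduced here. The function name and the absolute-residual score are illustrative assumptions.

```python
# Minimal sketch of standard (non-robust) split conformal prediction intervals.
# The paper's robust method additionally accounts for bounded distribution shift,
# which this baseline does not.
import numpy as np

def split_conformal_interval(cal_residuals, y_pred_test, alpha=0.1):
    """Prediction intervals with marginal coverage >= 1 - alpha under exchangeability."""
    scores = np.abs(np.asarray(cal_residuals))            # |y - y_hat| on a held-out calibration set
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
    q = np.quantile(scores, level, method="higher")
    return y_pred_test - q, y_pred_test + q

# Hypothetical usage with a fitted regressor `model`:
# cal_residuals = y_cal - model.predict(X_cal)
# lo, hi = split_conformal_interval(cal_residuals, model.predict(X_test), alpha=0.1)
```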
Recently, messaging applications such as WhatsApp have reportedly been abused by misinformation campaigns, especially in Brazil and India. A notable form of abuse on WhatsApp relies on manipulated images and memes containing all kinds of fake stories. In this work, we performed an extensive data collection from a large set of publicly accessible WhatsApp groups and fact-checking agency websites. This paper releases to the research community a novel dataset containing fact-checked fake images shared through WhatsApp in two distinct scenarios known for the spread of fake news on the platform: the 2018 Brazilian elections and the 2019 Indian elections.
We study the problem of fairly allocating a divisible resource, also known as cake cutting, with an additional requirement that the shares that different agents receive should be sufficiently separated from one another. This captures, for example, constraints arising from social distancing guidelines. While it is sometimes impossible to allocate a proportional share to every agent under the separation requirement, we show that the well-known criterion of maximin share fairness can always be attained. We then establish several computational properties of maximin share fairness -- for instance, the maximin share of an agent cannot be computed exactly by any finite algorithm, but can be approximated with an arbitrarily small error. In addition, we consider the division of a pie (i.e., a circular cake) and show that an ordinal relaxation of maximin share fairness can be achieved.
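As a rough illustration of the approximability claim above, the sketch below estimates a single agent's maximin share on a discretized cake with a separation requirement, by binary searching on the minimum piece value with a greedy feasibility check. This is not the paper's algorithm; the discretization, the piecewise-constant valuation model, and all names are assumptions for the example.

```python
# Sketch: approximate one agent's maximin share (MMS) on a discretized cake [0,1]
# when consecutive pieces must be separated by a minimum gap. Binary search on the
# minimum piece value; a greedy left-to-right sweep checks feasibility.
import numpy as np

def approx_mms(cell_values, n_agents, sep_cells, iters=50):
    """cell_values : per-cell value under the agent's valuation (piecewise constant)
       n_agents    : number of pieces the cake must be divided into
       sep_cells   : required separation between consecutive pieces, in cells"""
    cell_values = np.asarray(cell_values, dtype=float)
    m = len(cell_values)

    def feasible(t):
        # Greedily carve pieces of value >= t, inserting a gap after each piece.
        pieces, acc, i = 0, 0.0, 0
        while i < m and pieces < n_agents:
            acc += cell_values[i]
            i += 1
            if acc >= t:
                pieces += 1
                acc = 0.0
                i += sep_cells          # separation gap before the next piece
        return pieces >= n_agents

    lo, hi = 0.0, cell_values.sum() / n_agents   # MMS never exceeds the average
    for _ in range(iters):
        mid = (lo + hi) / 2
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical usage: uniform valuation over 1000 cells, 3 agents, gap of 50 cells.
# print(approx_mms(np.full(1000, 0.001), n_agents=3, sep_cells=50))
```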
The burgeoning of misleading or false information spread by untrustworthy websites has, without doubt, created a dangerous concoction. Thus, it is no surprise that the threat posed by untrustworthy websites has emerged as a central concern on the public agenda in many countries, including Czechia and Slovakia. However, combating this harmful phenomenon has proven difficult, with approaches primarily focusing on tackling consequences rather than prevention, as websites are routinely seen as quasi-sovereign organisms. Websites, however, rely upon a host of service providers, which, in a way, hold substantial power over them. Notwithstanding the apparent power held by such tech stack layers, scholarship on this topic remains largely limited. This article contributes to this small body of knowledge by providing a first-of-its-kind systematic mapping of the back-end infrastructural support that makes up the tech stacks of Czech and Slovak untrustworthy websites. Our approach is based on collecting and analyzing data on the top-level domain operators, domain name registrars, email providers, web hosting providers, and website tracking technologies used by 150 Czech and Slovak untrustworthy websites. Our findings show that the Czech and Slovak untrustworthy website landscape relies on a vast number of back-end services spread across multiple countries, yet in key tech stack layers it is still heavily dominated by locally based companies. Finally, given our findings, we discuss various possible avenues for utilizing the different tech stack layers in combating online disinformation.
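As an illustration of the kind of back-end data collection described above, the sketch below gathers nameservers, mail providers, and the registrar for a single domain. It is not the authors' pipeline; the use of the dnspython and python-whois packages, as well as the example domains, are assumptions.

```python
# Illustrative sketch of collecting basic tech-stack data for a list of domains:
# nameservers and mail providers via DNS, registrar via WHOIS. Error handling is minimal.
import dns.exception
import dns.resolver   # pip install dnspython
import whois          # pip install python-whois

def tech_stack_snapshot(domain):
    record = {"domain": domain, "nameservers": [], "mx": [], "registrar": None}
    try:
        record["nameservers"] = sorted(r.to_text() for r in dns.resolver.resolve(domain, "NS"))
        record["mx"] = sorted(r.to_text() for r in dns.resolver.resolve(domain, "MX"))
    except dns.exception.DNSException:
        pass  # domain may be offline or lack the queried records
    try:
        record["registrar"] = whois.whois(domain).registrar
    except Exception:
        pass  # WHOIS lookups are often rate-limited or return incomplete data
    return record

# Hypothetical usage over a list of studied domains:
# snapshots = [tech_stack_snapshot(d) for d in ["example.cz", "example.sk"]]
```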