
Why did the distribution change?

Published by: Kailash Budhathoki
Publication date: 2021
Research language: English





We describe a formal approach based on graphical causal models to identify the root causes of the change in the probability distribution of variables. After factorizing the joint distribution into conditional distributions of each variable, given its parents (the causal mechanisms), we attribute the change to changes of these causal mechanisms. This attribution analysis accounts for the fact that mechanisms often change independently and sometimes only some of them change. Through simulations, we study the performance of our distribution change attribution method. We then present a real-world case study identifying the drivers of the difference in the income distribution between men and women.
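To make the factorization and the attribution step concrete, the following is a minimal sketch for a two-variable chain X -> Y with discrete data, where the joint factorizes as P(X, Y) = P(X) P(Y | X). The shift of the target marginal P(Y) is split between the two mechanisms by swapping in the changed mechanisms one at a time and averaging over both orderings (a two-player Shapley value). The use of KL divergence as the change measure and the toy numbers are illustrative assumptions, not the paper's exact estimator.

import numpy as np

def marginal_y(p_x, p_y_given_x):
    """P(Y) implied by a choice of mechanisms P(X) and P(Y | X)."""
    return p_x @ p_y_given_x

def kl(p, q):
    """KL divergence between two discrete distributions."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def attribute_shift(old, new):
    """Shapley split of KL(P_new(Y) || P_old(Y)) over the mechanisms P(X) and P(Y | X)."""
    base = marginal_y(*old)
    def score(use_new_x, use_new_yx):
        p_x = new[0] if use_new_x else old[0]
        p_yx = new[1] if use_new_yx else old[1]
        return kl(marginal_y(p_x, p_yx), base)
    # Average the marginal contribution of each mechanism over both insertion orders.
    contrib_x = 0.5 * ((score(True, False) - score(False, False)) +
                       (score(True, True) - score(False, True)))
    contrib_yx = 0.5 * ((score(False, True) - score(False, False)) +
                        (score(True, True) - score(True, False)))
    return {"P(X)": contrib_x, "P(Y|X)": contrib_yx}

# Toy regimes in which only the mechanism P(Y | X) changes.
old = (np.array([0.5, 0.5]), np.array([[0.9, 0.1], [0.2, 0.8]]))
new = (np.array([0.5, 0.5]), np.array([[0.6, 0.4], [0.2, 0.8]]))
print(attribute_shift(old, new))  # the entire shift is attributed to P(Y|X)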




Read also

Machine learning (ML) prediction APIs are increasingly widely used. An ML API can change over time due to model updates or retraining. This presents a key challenge in the usage of the API because it is often not clear to the user if and how the ML model has changed. Model shifts can affect downstream application performance and also create oversight issues (e.g. if consistency is desired). In this paper, we initiate a systematic investigation of ML API shifts. We first quantify the performance shifts from 2020 to 2021 of popular ML APIs from Google, Microsoft, Amazon, and others on a variety of datasets. We identified significant model shifts in 12 out of 36 cases we investigated. Interestingly, we found several datasets where the APIs' predictions became significantly worse over time. This motivated us to formulate the API shift assessment problem at a more fine-grained level as estimating how the API model's confusion matrix changes over time when the data distribution is constant. Monitoring confusion matrix shifts using standard random sampling can require a large number of samples, which is expensive as each API call costs a fee. We propose a principled adaptive sampling algorithm, MASA, to efficiently estimate confusion matrix shifts. MASA can accurately estimate the confusion matrix shifts in commercial ML APIs using up to 90% fewer samples compared to random sampling. This work establishes ML API shifts as an important problem to study and provides a cost-effective approach to monitor such shifts.
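For a concrete sense of the quantity being monitored, the sketch below estimates a confusion-matrix shift between two model versions from a single labelled sample. It uses plain random-sampling estimation rather than MASA's adaptive scheme, and the metric (total absolute change across cells) as well as the api_2020 / api_2021 predictions are hypothetical stand-ins.

import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Normalised confusion matrix (cell frequencies) from labelled predictions."""
    cm = np.zeros((num_classes, num_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm / len(y_true)

def confusion_shift(y_true, pred_old, pred_new, num_classes):
    """Total absolute change in confusion-matrix cells between two model versions."""
    old = confusion_matrix(y_true, pred_old, num_classes)
    new = confusion_matrix(y_true, pred_new, num_classes)
    return float(np.abs(new - old).sum())

# Toy sample of six labelled queries sent to both versions of a binary classifier API.
y_true   = [0, 0, 1, 1, 1, 0]
api_2020 = [0, 0, 1, 0, 1, 0]  # predictions from the older model version
api_2021 = [0, 1, 1, 1, 1, 0]  # predictions after a silent model update
print(confusion_shift(y_true, api_2020, api_2021, num_classes=2))  # ~0.67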
In 1717 Halley compared contemporaneous measurements of the latitudes of four stars with earlier measurements by ancient Greek astronomers and by Brahe, and from the differences concluded that these four stars showed proper motion. An analysis with modern methods shows that the data used by Halley do not contain significant evidence for proper motion. What Halley found are the measurement errors of Ptolemaios and Brahe. Halley further argued that the occultation of Aldebaran by the Moon on 11 March 509 in Athens confirmed the change in latitude of Aldebaran. In fact, however, the relevant observation was almost certainly made in Alexandria, where Aldebaran was not occulted. By carefully considering measurement errors, Jacques Cassini showed that Halley's results from the comparison with earlier astronomers were spurious, a conclusion partially confirmed by various later authors. Cassini's careful study of the measurements of the latitude of Arcturus provides the first significant evidence for proper motion.
Based on millimeter-wavelength continuum observations we suggest that the recent spectacle of comet 17P/Holmes can be explained by a thick, air-tight dust cover and the effects of H2O sublimation, which started when the comet arrived at a heliocentric distance <= 2.5 AU. The porous structure inside the nucleus provided enough surface for additional sublimation, which eventually led to the break-up of the dust cover and to the observed outburst. The magnitude of the particle burst can be explained by the energy provided by insolation, stored in the dust cover and the nucleus within the months before the outburst: the subliming surface within the nucleus is more than one order of magnitude larger than the geometric surface of the nucleus -- possibly an indication of the latter's porous structure. Another surprise is that the abundance ratios of several molecular species with respect to H2O are variable. During this apparition, comet Holmes lost about 3% of its mass, corresponding to a dirty ice layer of 20 m.
We analyze state-of-the-art deep learning models for three tasks: question answering on (1) images, (2) tables, and (3) passages of text. Using the notion of attribution (word importance), we find that these deep networks often ignore important question terms. Leveraging such behavior, we perturb questions to craft a variety of adversarial examples. Our strongest attacks drop the accuracy of a visual question answering model from 61.1% to 19%, and that of a tabular question answering model from 33.5% to 3.3%. Additionally, we show how attributions can strengthen attacks proposed by Jia and Liang (2017) on paragraph comprehension models. Our results demonstrate that attributions can augment standard measures of accuracy and empower investigation of model performance. When a model is accurate but for the wrong reasons, attributions can surface erroneous logic in the model that indicates inadequacies in the test data.
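As a rough illustration of attribution-guided probing, the sketch below scores each question term by how much erasing it changes the model's answer confidence. This leave-one-out probe and the toy predict function are assumptions made for illustration; they are not necessarily the attribution method or models used in the paper.

def term_importance(question, predict_proba):
    """Score each word by how much removing it changes the model's top-answer confidence."""
    words = question.split()
    base = predict_proba(question)
    scores = {}
    for i, w in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        scores[w] = base - predict_proba(ablated)  # a large drop marks an important term
    return scores

# Toy stand-in model whose confidence depends only on the presence of the word "color",
# mimicking a QA model that ignores the rest of the question.
def toy_predict_proba(question):
    return 0.9 if "color" in question.lower() else 0.3

print(term_importance("What color is the fire hydrant ?", toy_predict_proba))
# Every term except "color" gets importance 0.0, i.e. the model ignores it.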
Having a large follower count has become a norm in social media and micro-blogging communities, and the competition for followers has been taking shape since the early days of Twitter. Despite this strong competition, many Twitter users continuously lose followers. This work addresses the problem of identifying the reasons behind the drop in followers of Twitter users. As a first step, we extract various features by analyzing the content of the posts made by Twitter users who consistently lose followers. We then leverage these features to detect follower loss early. We propose several models that yield an overall accuracy of 73% with high precision and recall. Our model outperforms the baseline model by 19.67% in accuracy, 33.8% in precision and 14.3% in recall.
