
Why did the distribution change?

Published by: Kailash Budhathoki
Publication date: 2021
Research language: English





We describe a formal approach based on graphical causal models to identify the root causes of the change in the probability distribution of variables. After factorizing the joint distribution into conditional distributions of each variable, given its parents (the causal mechanisms), we attribute the change to changes of these causal mechanisms. This attribution analysis accounts for the fact that mechanisms often change independently and sometimes only some of them change. Through simulations, we study the performance of our distribution change attribution method. We then present a real-world case study identifying the drivers of the difference in the income distribution between men and women.
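To make the factorization and the attribution step concrete, the following is a minimal sketch for a two-variable chain X -> Y with discrete data, where the joint factorizes as P(X, Y) = P(X) P(Y | X). The shift of the target marginal P(Y) is split between the two mechanisms by swapping in the changed mechanisms one at a time and averaging over both orderings (a two-player Shapley value). The use of KL divergence as the change measure and the toy numbers are illustrative assumptions, not the paper's exact estimator.

import numpy as np

def marginal_y(p_x, p_y_given_x):
    """P(Y) implied by a choice of mechanisms P(X) and P(Y | X)."""
    return p_x @ p_y_given_x

def kl(p, q):
    """KL divergence between two discrete distributions."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def attribute_shift(old, new):
    """Shapley split of KL(P_new(Y) || P_old(Y)) over the mechanisms P(X) and P(Y | X)."""
    base = marginal_y(*old)
    def score(use_new_x, use_new_yx):
        p_x = new[0] if use_new_x else old[0]
        p_yx = new[1] if use_new_yx else old[1]
        return kl(marginal_y(p_x, p_yx), base)
    # Average the marginal contribution of each mechanism over both insertion orders.
    contrib_x = 0.5 * ((score(True, False) - score(False, False)) +
                       (score(True, True) - score(False, True)))
    contrib_yx = 0.5 * ((score(False, True) - score(False, False)) +
                        (score(True, True) - score(True, False)))
    return {"P(X)": contrib_x, "P(Y|X)": contrib_yx}

# Toy regimes in which only the mechanism P(Y | X) changes.
old = (np.array([0.5, 0.5]), np.array([[0.9, 0.1], [0.2, 0.8]]))
new = (np.array([0.5, 0.5]), np.array([[0.6, 0.4], [0.2, 0.8]]))
print(attribute_shift(old, new))  # the entire shift is attributed to P(Y|X)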




Read also

Machine learning (ML) prediction APIs are increasingly widely used. An ML API can change over time due to model updates or retraining. This presents a key challenge in the usage of the API because it is often not clear to the user if and how the ML model has changed. Model shifts can affect downstream application performance and also create oversight issues (e.g. if consistency is desired). In this paper, we initiate a systematic investigation of ML API shifts. We first quantify the performance shifts from 2020 to 2021 of popular ML APIs from Google, Microsoft, Amazon, and others on a variety of datasets. We identified significant model shifts in 12 out of 36 cases we investigated. Interestingly, we found several datasets where the APIs' predictions became significantly worse over time. This motivated us to formulate the API shift assessment problem at a more fine-grained level as estimating how the API model's confusion matrix changes over time when the data distribution is constant. Monitoring confusion matrix shifts using standard random sampling can require a large number of samples, which is expensive as each API call costs a fee. We propose a principled adaptive sampling algorithm, MASA, to efficiently estimate confusion matrix shifts. MASA can accurately estimate the confusion matrix shifts in commercial ML APIs using up to 90% fewer samples compared to random sampling. This work establishes ML API shifts as an important problem to study and provides a cost-effective approach to monitor such shifts.
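For a concrete sense of the quantity being monitored, the sketch below estimates a confusion-matrix shift between two model versions from a single labelled sample. It uses plain random-sampling estimation rather than MASA's adaptive scheme, and the metric (total absolute change across cells) as well as the api_2020 / api_2021 predictions are hypothetical stand-ins.

import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Normalised confusion matrix (cell frequencies) from labelled predictions."""
    cm = np.zeros((num_classes, num_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm / len(y_true)

def confusion_shift(y_true, pred_old, pred_new, num_classes):
    """Total absolute change in confusion-matrix cells between two model versions."""
    old = confusion_matrix(y_true, pred_old, num_classes)
    new = confusion_matrix(y_true, pred_new, num_classes)
    return float(np.abs(new - old).sum())

# Toy sample of six labelled queries sent to both versions of a binary classifier API.
y_true   = [0, 0, 1, 1, 1, 0]
api_2020 = [0, 0, 1, 0, 1, 0]  # predictions from the older model version
api_2021 = [0, 1, 1, 1, 1, 0]  # predictions after a silent model update
print(confusion_shift(y_true, api_2020, api_2021, num_classes=2))  # ~0.67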
In 1717 Halley compared contemporaneous measurements of the latitudes of four stars with earlier measurements by ancient Greek astronomers and by Brahe, and from the differences concluded that these four stars showed proper motion. An analysis with modern methods shows that the data used by Halley do not contain significant evidence for proper motion. What Halley found are the measurement errors of Ptolemaios and Brahe. Halley further argued that the occultation of Aldebaran by the Moon on 11 March 509 in Athens confirmed the change in latitude of Aldebaran. In fact, however, the relevant observation was almost certainly made in Alexandria, where Aldebaran was not occulted. By carefully considering measurement errors, Jacques Cassini showed that Halley's results from the comparison with earlier astronomers were spurious, a conclusion partially confirmed by various later authors. Cassini's careful study of the measurements of the latitude of Arcturus provides the first significant evidence for proper motion.
Based on millimeter-wavelength continuum observations we suggest that the recent spectacle of comet 17P/Holmes can be explained by a thick, air-tight dust cover and the effects of H2O sublimation, which started when the comet arrived at a heliocentric distance <= 2.5 AU. The porous structure inside the nucleus provided enough surface for additional sublimation, which eventually led to the break-up of the dust cover and to the observed outburst. The magnitude of the particle burst can be explained by the energy provided by insolation, stored in the dust cover and the nucleus within the months before the outburst: the subliming surface within the nucleus is more than one order of magnitude larger than the geometric surface of the nucleus -- possibly an indication of the latter's porous structure. Another surprise is that the abundance ratios of several molecular species with respect to H2O are variable. During this apparition, comet Holmes lost about 3% of its mass, corresponding to a dirty ice layer of 20 m.
We analyze state-of-the-art deep learning models for three tasks: question answering on (1) images, (2) tables, and (3) passages of text. Using the notion of attribution (word importance), we find that these deep networks often ignore important question terms. Leveraging such behavior, we perturb questions to craft a variety of adversarial examples. Our strongest attacks drop the accuracy of a visual question answering model from 61.1% to 19%, and that of a tabular question answering model from 33.5% to 3.3%. Additionally, we show how attributions can strengthen attacks proposed by Jia and Liang (2017) on paragraph comprehension models. Our results demonstrate that attributions can augment standard measures of accuracy and empower investigation of model performance. When a model is accurate but for the wrong reasons, attributions can surface erroneous logic in the model that indicates inadequacies in the test data.
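As a rough illustration of attribution-guided probing, the sketch below scores each question term by how much erasing it changes the model's answer confidence. This leave-one-out probe and the toy predict function are assumptions made for illustration; they are not necessarily the attribution method or models used in the paper.

def term_importance(question, predict_proba):
    """Score each word by how much removing it changes the model's top-answer confidence."""
    words = question.split()
    base = predict_proba(question)
    scores = {}
    for i, w in enumerate(words):
        ablated = " ".join(words[:i] + words[i + 1:])
        scores[w] = base - predict_proba(ablated)  # a large drop marks an important term
    return scores

# Toy stand-in model whose confidence depends only on the presence of the word "color",
# mimicking a QA model that ignores the rest of the question.
def toy_predict_proba(question):
    return 0.9 if "color" in question.lower() else 0.3

print(term_importance("What color is the fire hydrant ?", toy_predict_proba))
# Every term except "color" gets importance 0.0, i.e. the model ignores it.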
Having a large follower count has become a norm in social media and micro-blogging communities, and the competition for followers has been taking shape since the early days of Twitter. Despite this strong competition, many Twitter users continuously lose followers. This work addresses the problem of identifying the reasons behind the drop in followers of Twitter users. As a first step, we extract various features by analyzing the content of the posts made by Twitter users who consistently lose followers. We then leverage these features to detect follower loss early. We propose several models that yield an overall accuracy of 73% with high precision and recall. Our model outperforms the baseline model by 19.67% in accuracy, 33.8% in precision and 14.3% in recall.
