
UEFA EURO 2020 Forecast via Nested Zero-Inflated Generalized Poisson Regression

Published by: Lorenz Gilch
Publication date: 2021
Research field: Mathematical statistics
Language: English
Author: Lorenz A. Gilch





This report is devoted to the forecast of the UEFA EURO 2020, Europe's continental football championship, taking place across Europe in June/July 2021. We present the simulation results for this tournament, where the simulations are based on a zero-inflated generalized Poisson regression model that includes the Elo points of the participating teams and the location of the matches as covariates and incorporates differences in team-specific skills. The proposed model yields predictions in terms of probabilities, which quantify the chances of each team reaching a certain stage of the tournament. We use Monte Carlo simulations to estimate the outcome of each single match of the tournament, from which we are able to simulate the whole tournament itself. The model is fitted on all football games of the participating teams since 2014, weighted by date and importance.
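As a rough illustration of the idea in the abstract, the sketch below draws goal counts from a zero-inflated generalized Poisson distribution and estimates win/draw/loss probabilities for one match by Monte Carlo. It is not the authors' model: the nested structure and the regression on Elo points and match location are omitted, the two teams' goals are assumed independent, Consul's parametrization of the generalized Poisson distribution is assumed, and all names and parameter values (`theta_a`, `theta_b`, `lam`, `omega`) are hypothetical.

```python
import math
import random


def gp_pmf(k, theta, lam):
    """Generalized Poisson pmf (Consul's form), assuming theta > 0 and 0 <= lam < 1.

    P(X = k) = theta * (theta + k*lam)**(k-1) * exp(-theta - k*lam) / k!
    With lam = 0 this reduces to the ordinary Poisson pmf.
    """
    log_p = (math.log(theta) + (k - 1) * math.log(theta + k * lam)
             - theta - k * lam - math.lgamma(k + 1))
    return math.exp(log_p)


def zigp_pmf(k, theta, lam, omega):
    """Zero-inflated version: extra probability mass omega is placed on zero."""
    base = (1.0 - omega) * gp_pmf(k, theta, lam)
    return omega + base if k == 0 else base


def sample_zigp(rng, theta, lam, omega, max_goals=40):
    """Inverse-CDF sampling, truncated at max_goals (a hypothetical cap)."""
    u = rng.random()
    cum = 0.0
    for k in range(max_goals + 1):
        cum += zigp_pmf(k, theta, lam, omega)
        if u < cum:
            return k
    return max_goals


def simulate_match(theta_a, theta_b, lam=0.1, omega=0.05, n_sims=20000, seed=0):
    """Estimate (P(A wins), P(draw), P(B wins)), assuming independent goal counts."""
    rng = random.Random(seed)
    wins = draws = losses = 0
    for _ in range(n_sims):
        goals_a = sample_zigp(rng, theta_a, lam, omega)
        goals_b = sample_zigp(rng, theta_b, lam, omega)
        if goals_a > goals_b:
            wins += 1
        elif goals_a == goals_b:
            draws += 1
        else:
            losses += 1
    return wins / n_sims, draws / n_sims, losses / n_sims


if __name__ == "__main__":
    # Hypothetical rate parameters; in a fitted model they would come from the regression.
    print(simulate_match(theta_a=1.6, theta_b=1.1))
```

Repeating such single-match draws along the tournament bracket would give stage-reaching probabilities for each team, which is the kind of output the abstract describes.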




Read also

Although basketball is a dynamic process sport, with 5 plus 5 players competing on both offense and defense simultaneously, learning some static information is predominant for professional players, coaches and team managers. In order to have a deep understanding of field goal attempts among different players, we propose a zero-inflated Poisson model with clustered regression coefficients to learn the shooting habits of different players over the court and the heterogeneity among them. Specifically, the zero-inflated model recovers the large proportion of the court with zero field goal attempts, and the mixture of finite mixtures model learns the heterogeneity among different players based on clustered regression coefficients and inflated probabilities. Both theoretical and empirical justification through simulation studies validate our proposed method. We apply our proposed model to the National Basketball Association (NBA), for learning players' shooting habits and heterogeneity among different players over the 2017-2018 regular season. This illustrates our model as a way of providing insights from different aspects.
Microorganisms play critical roles in human health and disease. It is well known that microbes live in diverse communities in which they interact synergistically or antagonistically. Thus, for estimating microbial associations with clinical covariates, multivariate statistical models are preferred. Multivariate models allow one to estimate and exploit complex interdependencies among multiple taxa, yielding more powerful tests of exposure or treatment effects than application of taxon-specific univariate analyses. In addition, the analysis of microbial count data requires special attention because data commonly exhibit zero inflation. To meet these needs, we developed a Bayesian variable selection model for multivariate count data with excess zeros that incorporates information on the covariance structure of the outcomes (counts for multiple taxa), while estimating associations with the mean levels of these outcomes. Although there has been a great deal of effort in zero-inflated models for longitudinal data, little attention has been given to high-dimensional multivariate zero-inflated data modeled via a general correlation structure. Through simulation, we compared performance of the proposed method to that of existing univariate approaches, for both the binary and count parts of the model. When outcomes were correlated, the proposed variable selection method maintained type I error while boosting the ability to identify true associations in the binary component of the model. For the count part of the model, in some scenarios the univariate method had higher power than the multivariate approach. This higher power came at the cost of a highly inflated false discovery rate not observed with the proposed multivariate method. We applied the approach to oral microbiome data from the Pediatric HIV/AIDS Cohort Oral Health Study and identified five species (of 44) associated with HIV infection.
We propose a nested reduced-rank regression (NRRR) approach for fitting regression models with multivariate functional responses and predictors, to achieve tailored dimension reduction and facilitate interpretation/visualization of the resulting functional model. Our approach is based on a two-level low-rank structure imposed on the functional regression surfaces. A global low-rank structure identifies a small set of latent principal functional responses and predictors that drives the underlying regression association. A local low-rank structure then controls the complexity and smoothness of the association between the principal functional responses and predictors. Through a basis expansion approach, the functional problem boils down to an interesting integrated matrix approximation task, where the blocks or submatrices of an integrated low-rank matrix share some common row space and/or column space. An iterative algorithm with convergence guarantee is developed. We establish the consistency of NRRR and also show through non-asymptotic analysis that it can achieve at least a comparable error rate to that of the reduced-rank regression. Simulation studies demonstrate the effectiveness of NRRR. We apply NRRR in an electricity demand problem, to relate the trajectories of the daily electricity consumption with those of the daily temperatures.
In the United States the preferred method of obtaining dietary intake data is the 24-hour dietary recall, yet the measure of most interest is usual or long-term average daily intake, which is impossible to measure. Thus, usual dietary intake is assessed with considerable measurement error. Also, diet represents numerous foods, nutrients and other components, each of which has distinctive attributes. Sometimes, it is useful to examine intake of these components separately, but increasingly nutritionists are interested in exploring them collectively to capture overall dietary patterns. Consumption of these components varies widely: some are consumed daily by almost everyone on every day, while others are episodically consumed so that 24-hour recall data are zero-inflated. In addition, they are often correlated with each other. Finally, it is often preferable to analyze the amount of a dietary component relative to the amount of energy (calories) in a diet because dietary recommendations often vary with energy level. The quest to understand overall dietary patterns of usual intake has to this point reached a standstill. There are no statistical methods or models available to model such complex multivariate data with its measurement error and zero inflation. This paper proposes the first such model, and it proposes the first workable solution to fit such a model. After describing the model, we use survey-weighted MCMC computations to fit the model, with uncertainty estimation coming from balanced repeated replication.
Daisuke Murakami, 2021
This study presents application examples of generalized spatial regression modeling for count data and continuous non-Gaussian data using the spmoran package (version 0.2.2 onward). Section 2 introduces the model. The subsequent sections demonstrate applications of the model for disease mapping, spatial prediction and uncertainty modeling, and hedonic analysis. The R codes used in this vignette are available from https://github.com/dmuraka/spmoran. Another vignette focusing on Gaussian spatial regression modeling is also available from the same GitHub page.