Women's football is gaining supporters and practitioners worldwide, raising questions about how it differs from men's football. While the two sports are often compared based on the players' physical attributes, we analyze the spatio-temporal events during matches in the last World Cups to compare male and female teams based on their technical performance. We train an artificial intelligence model to recognize whether a team is male or female based on variables that describe a match's playing intensity, accuracy, and performance quality. Our model accurately distinguishes between men's and women's football, revealing crucial technical differences, which we investigate by extracting explanations from the classifier's decisions. The differences between men's and women's football are rooted in play accuracy, the recovery time of ball possession, and the players' performance quality. Our methodology may help journalists and fans understand what makes women's football a distinct sport, and may help coaches design tactics tailored to female teams.
To investigate whether training load monitoring data could be used to predict injuries in elite Australian football players, data were collected from elite athletes over 3 seasons at an Australian football club. Loads were quantified using GPS devices, accelerometers and player perceived exertion ratings. Absolute and relative training load metrics were calculated for each player each day (rolling average, exponentially weighted moving average, acute:chronic workload ratio, monotony and strain). Injury prediction models (regularised logistic regression, generalised estimating equations, random forests and support vector machines) were built for non-contact, non-contact time-loss and hamstring-specific injuries using the first two seasons of data. Injury predictions were generated for the third season and evaluated using the area under the receiver operating characteristic curve (AUC). Predictive performance was only marginally better than chance for models of non-contact and non-contact time-loss injuries (AUC$<$0.65). The best performing model was a multivariate logistic regression for hamstring injuries (best AUC=0.76). Learning curves suggested that logistic regression was underfitting the load-injury relationship and that using a more complex model or increasing the amount of model-building data may lead to future improvements. Injury prediction models built using training load data from a single club showed poor ability to predict injuries when tested on previously unseen data, suggesting they are limited as a daily decision tool for practitioners. Focusing the modelling approach on specific injury types and increasing the amount of training data may lead to the development of improved predictive models for injury prevention.
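The load metrics named in this abstract can be illustrated with a minimal sketch. It uses commonly cited definitions and is not the authors' implementation: the 7-day acute and 28-day chronic windows, the decay factor 2/(span+1), the use of the sample standard deviation for monotony, and all function names are assumptions made for illustration.

```python
from statistics import mean, stdev

def rolling_mean(loads, window):
    """Mean of the last `window` daily loads ending at each day
    (uses a shorter window at the start of the series)."""
    return [mean(loads[max(0, i - window + 1): i + 1]) for i in range(len(loads))]

def ewma(loads, span):
    """Exponentially weighted moving average with decay 2 / (span + 1) (assumed)."""
    lam = 2.0 / (span + 1)
    out = []
    for i, x in enumerate(loads):
        out.append(float(x) if i == 0 else lam * x + (1 - lam) * out[-1])
    return out

def acwr(loads, acute=7, chronic=28):
    """Acute:chronic workload ratio from rolling averages (window lengths assumed)."""
    a = rolling_mean(loads, acute)
    c = rolling_mean(loads, chronic)
    return [ai / ci if ci > 0 else None for ai, ci in zip(a, c)]

def monotony(week):
    """Mean daily load divided by its standard deviation over one week."""
    sd = stdev(week)
    return mean(week) / sd if sd > 0 else None

def strain(week):
    """Total weekly load multiplied by monotony."""
    m = monotony(week)
    return sum(week) * m if m is not None else None
```

A perfectly constant load gives a ratio of 1 and an undefined monotony (zero variance), which is why the helpers return `None` rather than dividing by zero.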
Predicting pregnancy has been a fundamental problem in women's health for more than 50 years. Previous datasets have been collected via carefully curated medical studies, but the recent growth of women's health tracking mobile apps offers potential for reaching a much broader population. However, the feasibility of predicting pregnancy from mobile health tracking data is unclear. Here we develop four models -- a logistic regression model and three LSTM models -- to predict a woman's probability of becoming pregnant using data from a women's health tracking app, Clue by BioWink GmbH. Evaluating our models on a dataset of 79 million logs from 65,276 women with ground truth pregnancy test data, we show that our predicted pregnancy probabilities meaningfully stratify women: women in the top 10% of predicted probabilities have an 89% chance of becoming pregnant over 6 menstrual cycles, as compared to a 27% chance for women in the bottom 10%. We develop a technique for extracting interpretable time trends from our deep learning models, and show that these trends are consistent with previous fertility research. Our findings illustrate the potential that women's health tracking data offers for predicting pregnancy in a broader population; we conclude by discussing the steps needed to fulfill this potential.
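The stratification reported in this abstract (observed outcome rates in the top and bottom 10% of predicted probabilities) can be sketched as follows. This is a simplified illustration, not the authors' evaluation code: it compares plain observed rates and ignores the cumulative six-cycle calculation, and the function name `top_bottom_rates` and its parameters are hypothetical.

```python
def top_bottom_rates(probs, outcomes, n_groups=10):
    """Return the observed outcome rate among the individuals with the
    highest and the lowest predicted probabilities.

    probs    : predicted probabilities, one per individual
    outcomes : 0/1 observed outcomes, aligned with probs
    n_groups : number of equal-sized groups (10 gives top/bottom 10%)
    """
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and aligned")
    # Indices sorted from lowest to highest predicted probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    k = max(1, len(probs) // n_groups)  # group size, at least one individual
    bottom, top = order[:k], order[-k:]

    def rate(idx):
        return sum(outcomes[i] for i in idx) / len(idx)

    return rate(top), rate(bottom)
```

A large gap between the two returned rates indicates that the predicted probabilities rank individuals in a meaningful way, even when the absolute probabilities are not well calibrated.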
We propose a versatile joint regression framework for count responses. The method is implemented in the R add-on package GJRM and allows for modelling linear and non-linear dependence through the use of several copulae. Moreover, the parameters of the marginal distributions of the count responses and of the copula can be specified as flexible functions of covariates. Motivated by a football application, we also discuss an extension which forces the regression coefficients of the marginal (linear) predictors to be equal via a suitable penalisation. Model fitting is based on a trust region algorithm which estimates all the parameters of the joint models simultaneously. We investigate the proposal's empirical performance in two simulation studies, the first designed for arbitrary count data, the second reflecting football-specific settings. Finally, the method is applied to FIFA World Cup data, showing that it is competitive with the standard approach in terms of predictive performance.
American football is the most popular high school sport and is among the leading causes of injury among adolescents. While there has been considerable recent attention on the link between football and cognitive decline, there is also evidence of higher than expected rates of pain, obesity, and lower quality of life among former professional players, either as a result of repetitive head injury or through different mechanisms. Previously hidden downstream effects of playing football may have far-reaching public health implications for participants in youth and high school football programs. Our proposed study is a retrospective observational study that compares 1,153 high school males who played varsity football with 2,751 male students who did not. Of the control subjects, 1,951 did not play any sport and the remaining 800 played a non-contact sport. Our primary outcome is self-rated health measured at age 65. To control for potential confounders, we adjust for pre-exposure covariates with matching and model-based covariance adjustment. We will conduct an ordered testing procedure designed to use the full pool of 2,751 controls while also controlling for possible unmeasured differences between students who played sports and those who did not. We will quantitatively assess the sensitivity of the results to potential unmeasured confounding. The study will also assess secondary outcomes of pain, difficulty with activities of daily living, and obesity, as these are both important to individual well-being and relevant to public health.
More than 1 million students play high school American football annually, but many health professionals have recently questioned its safety or called for its ban. These concerns have been partially driven by reports of chronic traumatic encephalopathy (CTE), increased risks of neurodegenerative disease, and associations between concussion history and later-life cognitive impairment and depression among retired professional football players. A recent observational study of a cohort of men who graduated from a Wisconsin high school in 1957 found no statistically significant harmful effects of playing high school football on a range of cognitive, psychological, and socio-economic outcomes measured at ages 35, 54, 65, and 72. Unfortunately, these findings may not generalize to younger populations, owing to changes and improvements in football helmet technology and training techniques. In particular, these changes may have led to increased perceptions of safety but ultimately more dangerous styles of play, characterized by the frequent sub-concussive impacts thought to be associated with later-life neurological decline. In this work, we replicate the methodology of that earlier matched observational study using data from the National Longitudinal Study of Adolescent to Adult Health (Add Health). These data include adolescent and family co-morbidities, academic experience, self-reported levels of general health and physical activity, and the score on the Add Health Picture Vocabulary Test. Our primary outcome is the CES-D score measured in 2008, when subjects were aged 24--34 and settling into early adulthood. We also examine several secondary outcomes related to physical and psychological health, including suicidality. Our results can provide insight into the natural history of potential football-related decline and dysfunction.