Sports are spontaneous generators of stories. Through skill and chance, the script of each game is dynamically written in real time by players acting out possible trajectories allowed by a sport's rules. By properly characterizing a given sport's ecology of 'game stories', we are able to capture the sport's capacity for unfolding interesting narratives, in part by contrasting them with random walks. Here, we explore the game story space afforded by a data set of 1,310 Australian Football League (AFL) score lines. We find that AFL games exhibit a continuous spectrum of stories rather than distinct clusters. We show how coarse-graining reveals identifiable motifs ranging from last-minute comeback wins to one-sided blowouts. Through an extensive comparison with biased random walks, we show that real AFL games deliver a broader array of motifs than null models, and we provide consequent insights into the narrative appeal of real games.
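As a rough illustration of the kind of null model mentioned above, the following sketch simulates a biased random walk of the score margin and counts lead changes. It is a minimal sketch under stated assumptions, not the procedure used in the study: each scoring event moves the margin by one unit (real AFL scores come in goals and behinds of different value), and the names `biased_walk_margin`, `lead_changes`, and the bias `p_home` are hypothetical.

```python
import random


def biased_walk_margin(n_events, p_home, rng):
    """Return the running margin (home minus away) after each scoring event.

    Assumption: every event is worth one unit and goes to the home side
    with fixed probability p_home, independently of the game state.
    """
    margin = 0
    path = []
    for _ in range(n_events):
        margin += 1 if rng.random() < p_home else -1
        path.append(margin)
    return path


def lead_changes(path):
    """Count sign flips of the margin, ignoring moments when scores are tied."""
    changes = 0
    last_sign = 0
    for m in path:
        sign = (m > 0) - (m < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                changes += 1
            last_sign = sign
    return changes


rng = random.Random(0)  # fixed seed so the run is repeatable
path = biased_walk_margin(50, 0.55, rng)  # 50 events, slight home bias (illustrative)
print("final margin:", path[-1], "lead changes:", lead_changes(path))
```

Comparing summary statistics such as lead changes between many simulated walks and real score lines is one simple way to see where real games depart from the null model.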
To investigate whether training load monitoring data could be used to predict injuries in elite Australian football players, data were collected from elite athletes over 3 seasons at an Australian football club. Loads were quantified using GPS devices, accelerometers and player-perceived exertion ratings. Absolute and relative training load metrics were calculated for each player each day (rolling average, exponentially weighted moving average, acute:chronic workload ratio, monotony and strain). Injury prediction models (regularised logistic regression, generalised estimating equations, random forests and support vector machines) were built for non-contact, non-contact time-loss and hamstring-specific injuries using the first two seasons of data. Injury predictions were generated for the third season and evaluated using the area under the receiver operating characteristic curve (AUC). Predictive performance was only marginally better than chance for models of non-contact and non-contact time-loss injuries (AUC $<$ 0.65). The best performing model was a multivariate logistic regression for hamstring injuries (best AUC = 0.76). Learning curves suggested logistic regression was underfitting the load-injury relationship and that using a more complex model or increasing the amount of model building data may lead to future improvements. Injury prediction models built using training load data from a single club showed poor ability to predict injuries when tested on previously unseen data, suggesting they are limited as a daily decision tool for practitioners. Focusing the modelling approach on specific injury types and increasing the amount of training data may lead to the development of improved predictive models for injury prevention.
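The load metrics listed above can be sketched as follows. This is an illustrative sketch only: the 7-day acute and 28-day chronic windows, the decay constant 2 / (span + 1), and the use of the population standard deviation for monotony are common conventions assumed here, not values stated in the abstract, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def rolling_mean(loads, window):
    """Mean of the most recent `window` daily loads (fewer if history is short)."""
    return mean(loads[-window:])


def ewma(loads, span):
    """Exponentially weighted moving average, seeded with the first load.

    Assumption: decay alpha = 2 / (span + 1).
    """
    alpha = 2.0 / (span + 1)
    value = float(loads[0])
    for load in loads[1:]:
        value = alpha * load + (1 - alpha) * value
    return value


def acwr(loads, acute=7, chronic=28):
    """Acute:chronic workload ratio from rolling means; None if chronic load is 0."""
    chronic_load = rolling_mean(loads, chronic)
    if chronic_load == 0:
        return None
    return rolling_mean(loads, acute) / chronic_load


def monotony_and_strain(week):
    """Monotony = mean / SD of daily loads; strain = weekly total * monotony."""
    sd = pstdev(week)
    if sd == 0:
        return None, None  # undefined when every day has the same load
    monotony = mean(week) / sd
    return monotony, sum(week) * monotony


# Made-up daily loads in arbitrary units, for illustration only.
daily = [300, 450, 0, 500, 400, 0, 600] * 4
print(acwr(daily), ewma(daily, 7), monotony_and_strain(daily[-7:]))
```

Each function returns a single value for "today"; applying it to a growing history, day by day and player by player, yields the per-day features that a prediction model would consume.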
We investigate the relation between the number of passes made by a football team and the number of goals it scores. We analyze the 380 matches of a complete season of the Spanish national league LaLiga (2018/2019). We observe that the number of goals scored is positively correlated with the number of passes made by a team. Accordingly, teams at the top (bottom) of the ranking at the end of the season make more (fewer) passes than the rest of the teams. However, we observe a strong asymmetry when the analysis is split by half of the match. Interestingly, fewer passes are made in the second half of a match while, at the same time, more goals are scored. This paradox appears in the majority of teams, and it is independent of the number of passes made. These results confirm that goals in the first half of matches are more costly in terms of passes than those scored in the second half.
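The two quantities discussed above, the pass-goal correlation and the "cost" of a goal in passes, can be computed as in the sketch below. The numbers are invented placeholders, not data from the LaLiga season analyzed, and the function names `pearson` and `passes_per_goal` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def passes_per_goal(passes, goals):
    """Average number of passes made per goal scored; None if no goals."""
    return passes / goals if goals else None


# Placeholder season totals for five imaginary teams (not real data).
season_passes = [21000, 18500, 16000, 14500, 13000]
season_goals = [90, 63, 55, 46, 37]
print("correlation:", pearson(season_passes, season_goals))

# Placeholder split by half for one imaginary team (not real data).
print("first half:", passes_per_goal(9000, 20))
print("second half:", passes_per_goal(8000, 26))
```

A higher passes-per-goal value in the first half than in the second is what the asymmetry described above would look like in this representation.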
Analyzing football score data with statistical techniques, we investigate how the highly co-operative nature of the game is reflected in averaged properties such as the distributions of scored goals for the home and away teams. It turns out that, in particular, the tails of the distributions are not well described by independent Bernoulli trials, but are rather well modeled by negative binomial or generalized extreme value distributions. To understand this behavior from first principles, we suggest modifying the Bernoulli random process to include a simple component of self-affirmation, which seems to describe the data surprisingly well and allows us to interpret the observed deviation from Gaussian statistics. The phenomenological distributions used before can be understood as special cases within this framework. We analyzed historical football score data from many leagues in Europe as well as from international tournaments and found the proposed models to be applicable rather universally. In particular, here we compare men's and women's leagues and the separate German leagues during the Cold War and find some remarkable differences.
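One way to picture a Bernoulli process with self-affirmation is sketched below: a team scores in each time step with probability `p`, and every goal raises that probability for the rest of the match. The multiplicative update `p * kappa`, the parameter values, and the function name `simulate_goals` are assumptions made for illustration; the abstract does not specify the exact update rule.

```python
import random


def simulate_goals(n_steps, p0, kappa, rng):
    """Simulate the goals of one team in one match.

    Assumptions: the match is split into n_steps time steps; the team scores
    in a step with probability p; after each goal p is multiplied by kappa
    (kappa > 1 means scoring encourages further scoring) and capped at 1.
    With kappa = 1 this reduces to independent Bernoulli trials.
    """
    p = p0
    goals = 0
    for _ in range(n_steps):
        if rng.random() < p:
            goals += 1
            p = min(1.0, p * kappa)
    return goals


def goal_histogram(n_matches, n_steps, p0, kappa, seed):
    """Return {goals: number of simulated matches with that many goals}."""
    rng = random.Random(seed)
    hist = {}
    for _ in range(n_matches):
        g = simulate_goals(n_steps, p0, kappa, rng)
        hist[g] = hist.get(g, 0) + 1
    return hist


# Illustrative parameters only: 90 steps, baseline p0 = 0.015.
plain = goal_histogram(10000, 90, 0.015, 1.0, seed=1)
affirmed = goal_histogram(10000, 90, 0.015, 1.3, seed=1)
print("max goals without feedback:", max(plain))
print("max goals with feedback:", max(affirmed))
```

Comparing the two histograms shows how feedback after each goal shifts weight toward high-scoring matches, which is the qualitative effect on the tails described above.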
We investigate the modeling capabilities of sets of coupled classical harmonic oscillators (CHO) in the form of a modeling game. The application of simple but restrictive rules of the game leads to conditions for an isomorphism between Lie algebras and real Clifford algebras. We show that the correlations between two coupled classical oscillators find their natural description in the Dirac algebra and allow us to model aspects of special relativity, inertial motion, electromagnetism and quantum phenomena, including spin, in one go. The algebraic properties of Hamiltonian motion of low-dimensional systems can generally be related to certain types of interactions and hence to the dimensionality of emergent space-times. We describe the intrinsic connection between phase space volumes of a 2-dimensional oscillator and the Dirac algebra. In this version of a phase space interpretation of quantum mechanics, the (components of the) spinor wave function in momentum space are abstract canonical coordinates, and the integrals over the squared wave function represent second moments in phase space. The wave function in ordinary space-time can be obtained via Fourier transformation. Within this modeling game, 3+1-dimensional space-time is interpreted as a structural property of electromagnetic interaction. A generalization selects a series of Clifford algebras of specific dimensions with similar properties, specifically also 10- and 26-dimensional real Clifford algebras.
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have been applied to football, due to a huge increase in data collection by professional teams, increased computational power, and advances in machine learning, with the goal of better addressing new scientific challenges involved in the analysis of both individual players' and coordinated teams' behaviors. The research challenges associated with predictive and prescriptive football analytics require new developments and progress at the intersection of statistical learning, game theory, and computer vision. In this paper, we provide an overarching perspective highlighting how the combination of these fields, in particular, forms a unique microcosm for AI research, while offering mutual benefits for professional teams, spectators, and broadcasters in the years to come. We illustrate that this duality makes football analytics a game changer of tremendous value, in terms of not only changing the game of football itself, but also in terms of what this domain can mean for the field of AI. We review the state of the art and exemplify the types of analysis enabled by combining the aforementioned fields, including illustrative examples of counterfactual analysis using predictive models, and the combination of game-theoretic analysis of penalty kicks with statistical learning of player attributes. We conclude by highlighting envisioned downstream impacts, including possibilities for extensions to other sports (real and virtual).
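To make the game-theoretic view of penalty kicks concrete, the sketch below solves a two-action zero-sum game in which the kicker and the goalkeeper each choose left or right. It is a textbook simplification and not the method of the paper: the scoring probabilities are invented, real analyses use richer action sets and learned player attributes, and the function name `solve_2x2_zero_sum` is hypothetical.

```python
def solve_2x2_zero_sum(a_ll, a_lr, a_rl, a_rr):
    """Mixed-strategy equilibrium of a 2x2 zero-sum penalty game.

    a_xy is the probability of a goal when the kicker aims at side x and the
    keeper dives to side y (l = left, r = right). The kicker maximizes and
    the keeper minimizes the scoring probability.

    Returns (kicker's probability of left, keeper's probability of left,
    equilibrium scoring probability). Assumes the equilibrium is fully
    mixed; raises ValueError otherwise.
    """
    denom = a_ll - a_lr - a_rl + a_rr
    if denom == 0:
        raise ValueError("degenerate payoffs: no unique mixed equilibrium")
    kicker_left = (a_rr - a_rl) / denom  # makes the keeper indifferent
    keeper_left = (a_rr - a_lr) / denom  # makes the kicker indifferent
    if not (0 <= kicker_left <= 1 and 0 <= keeper_left <= 1):
        raise ValueError("a pure strategy dominates: no interior mixed equilibrium")
    value = (a_ll * a_rr - a_lr * a_rl) / denom
    return kicker_left, keeper_left, value


# Invented scoring probabilities, for illustration only.
kicker_left, keeper_left, value = solve_2x2_zero_sum(0.5, 0.9, 0.8, 0.6)
print(kicker_left, keeper_left, value)
```

In a combined approach, the four scoring probabilities would be estimated per player from data rather than fixed by hand, and the equilibrium mix would then be compared with the player's observed choices.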