Public special events, such as sports games, concerts, and festivals, are well known to create disruptions in transportation systems, often catching operators by surprise. Although these events are usually planned well in advance, their impact is difficult to predict, even when organisers and transportation operators coordinate. The problem grows considerably when several events happen concurrently. To address it, large cities like Singapore, London or Tokyo usually rely on costly processes that depend heavily on manual search and personal experience. This paper presents a Bayesian additive model with Gaussian process components that combines smart card records from public transport with context information about events that is continuously mined from the Web. We develop an efficient approximate inference algorithm using expectation propagation, which allows us to predict the total number of public transportation trips to the special event areas, thereby contributing to a more adaptive transportation system. Furthermore, for multiple concurrent event scenarios, the proposed algorithm is able to disaggregate gross trip counts into their most likely components related to specific events and routine behavior. Using real data from Singapore, we show that the presented model outperforms the best baseline model by up to 26% in R² and also has explanatory power for its individual components.
Many special events, including sports games and concerts, cause surges in demand and congestion for transit systems. It is therefore important for transit providers to understand their impact on disruptions, delays, and fare revenues. This paper proposes a suite of data-driven techniques that exploit Automated Fare Collection (AFC) data for evaluating, anticipating, and managing the performance of transit systems during recurring congestion peaks due to special events. This includes an extensive analysis of ridership at the two major stadiums in downtown Atlanta using rail data from the Metropolitan Atlanta Rapid Transit Authority (MARTA). The paper first highlights the ridership predictability at the aggregate level for each station on both event and non-event days. It then presents an unsupervised machine-learning model to cluster passengers and identify which train they are boarding. The model makes it possible to evaluate system performance in terms of fundamental metrics such as the passenger load per train and the wait times of riders. The paper also presents linear regression and random forest models for predicting ridership, which are used in combination with historical throughput analysis to forecast demand. Finally, simulations showcase the potential improvements to wait times and demand matching obtained by using the proposed techniques to optimize train frequencies based on forecasted demand.
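As a rough illustration of how a demand forecast might be turned into a train frequency, the sketch below applies a simple capacity rule: run enough trains per hour to carry the forecast ridership. This is not the optimization used in the paper; the function names, the capacity of 800 riders per train, and the forecast of 4500 riders are all hypothetical values chosen for the example.

```python
import math


def trains_per_hour(forecast_riders, train_capacity, min_trains=1):
    """Smallest number of trains per hour whose total capacity covers the forecast.

    All arguments are illustrative assumptions, not values from the paper.
    """
    if train_capacity <= 0:
        raise ValueError("train_capacity must be positive")
    return max(min_trains, math.ceil(forecast_riders / train_capacity))


def headway_minutes(n_trains):
    """Evenly spaced headway, in minutes, for n_trains departures per hour."""
    return 60.0 / n_trains


# Hypothetical post-event hour: 4500 forecast riders, 800 riders per train.
n_trains = trains_per_hour(4500, 800)
headway = headway_minutes(n_trains)
```

A rule of this kind ignores wait-time targets and fleet limits, which a fuller treatment would add as constraints.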
Self-reinforcing feedback loops in personalization systems are typically caused by users choosing from a limited set of alternatives that are presented systematically based on previous choices. We propose a Bayesian choice model built on Luce's axioms that explicitly accounts for users' limited exposure to alternatives. Our model is fair, in that it does not impose a negative bias towards unpresented alternatives, and practical, in that preference estimates are accurately inferred after observing a small number of interactions. It also allows efficient sampling, leading to a straightforward online presentation mechanism based on Thompson sampling. Our approach achieves low regret in learning to present after exploring only a small fraction of possible presentations. The proposed structure can be reused as a building block in interactive systems, e.g., recommender systems, free of feedback loops.
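To make the role of limited exposure concrete, the sketch below evaluates Luce's choice rule over only the alternatives that were presented: each presented item is chosen with probability proportional to its preference weight, and unpresented items do not enter the normalization. It covers only this likelihood ingredient, not the Bayesian inference or the Thompson sampling mechanism; the function name and the weights are hypothetical.

```python
def luce_choice_probs(weights, presented):
    """Choice probabilities under Luce's choice rule, restricted to `presented`.

    `weights` maps every alternative to a positive preference weight
    (illustrative values, assumed here); `presented` lists the alternatives
    the user actually saw.
    """
    total = sum(weights[item] for item in presented)
    if total <= 0:
        raise ValueError("presented items must have positive total weight")
    return {item: weights[item] / total for item in presented}


# Hypothetical weights for three alternatives; only "a" and "b" are shown.
weights = {"a": 2.0, "b": 1.0, "c": 1.0}
probs = luce_choice_probs(weights, ["a", "b"])
```

Because "c" was never shown, observing a choice between "a" and "b" carries no evidence against it, which is the sense in which restricting the normalization avoids penalizing unpresented alternatives.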
Interpretability has largely focused on local explanations, i.e. explaining why a model made a particular prediction for a sample. These explanations are appealing due to their simplicity and local fidelity. However, they do not provide information about the general behavior of the model. We propose to leverage model distillation to learn global additive explanations that describe the relationship between input features and model predictions. These global explanations take the form of feature shapes, which are more expressive than feature attributions. Through careful experimentation, we show qualitatively and quantitatively that global additive explanations are able to describe model behavior and yield insights about models such as neural nets. A visualization of our approach applied to a neural net as it is trained is available at https://youtu.be/ErQYwNqzEdc.
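The idea of distilling a black-box model into additive feature shapes can be sketched on a toy case. Assuming a two-feature teacher evaluated on a full grid, the least-squares additive fit has a closed form: the grand mean of the teacher's predictions plus, for each feature, the per-value mean minus the grand mean. The teacher function, the grid, and all names below are hypothetical, and this closed form stands in for whatever student training procedure the paper actually uses.

```python
def additive_distill(teacher, grid1, grid2):
    """Fit mean + shape1[i] + shape2[j] to teacher outputs on a full grid.

    On a balanced grid this is the least-squares additive approximation.
    """
    preds = [[teacher(a, b) for b in grid2] for a in grid1]
    n1, n2 = len(grid1), len(grid2)
    mean = sum(sum(row) for row in preds) / (n1 * n2)
    # Feature shape for x1: average over x2, centred on the grand mean.
    shape1 = [sum(row) / n2 - mean for row in preds]
    # Feature shape for x2: average over x1, centred on the grand mean.
    shape2 = [sum(preds[i][j] for i in range(n1)) / n1 - mean for j in range(n2)]
    return mean, shape1, shape2


def teacher(x1, x2):
    """Hypothetical black-box model, standing in for e.g. a neural net."""
    return x1 ** 2 + 3.0 * x2


grid = [-1.0, 0.0, 1.0]
mean, shape1, shape2 = additive_distill(teacher, grid, grid)
```

Here the teacher happens to be additive, so the student reproduces it exactly on the grid; for a teacher with feature interactions, the residual would measure what the additive explanation misses.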
Many applications of machine learning involve the analysis of large data frames with missing values, i.e., matrices collecting heterogeneous measurements (binary, numerical, counts, etc.) across samples. Low-rank models, as studied by Udell et al. [30], are popular in this framework for tasks such as visualization, clustering and missing value imputation. Yet, available methods with statistical guarantees and efficient optimization do not allow explicit modeling of main additive effects such as row, column, or covariate effects. In this paper, we introduce a low-rank interaction and sparse additive effects (LORIS) model, which combines matrix regression on a dictionary with a low-rank design to estimate main effects and interactions simultaneously. We provide statistical guarantees in the form of upper bounds on the estimation error of both components. Then, we introduce a mixed coordinate gradient descent (MCGD) method which provably converges sub-linearly to an optimal solution and is computationally efficient for large-scale data sets. We show on simulated and survey data that the method has a clear advantage over current practices, which consist of handling additive effects separately in a preprocessing step.
We propose a hierarchical Bayesian recurrent state space model for modeling switching network connectivity in resting-state fMRI data. Our model allows us to uncover shared network patterns across disease conditions. We evaluate our method on the ADNI2 dataset by inferring latent state patterns corresponding to altered neural circuits in individuals with Mild Cognitive Impairment (MCI). In addition to states shared across healthy individuals and individuals with MCI, we discover latent states that are predominantly observed in individuals with MCI. Our model outperforms the current state-of-the-art deep learning method on the ADNI2 dataset.