Modeling Stated Preference for Mobility-on-Demand Transit: A Comparison of Machine Learning and Logit Models


Abstract in English

Logit models are commonly used to study individual travel behavior, both to predict travel mode choice and to gain behavioral insights into traveler preferences. Recently, some studies have applied machine learning to model travel mode choice and reported higher out-of-sample predictive accuracy than traditional logit models (e.g., multinomial logit). However, little research has compared the interpretability of machine learning with that of logit models. In other words, how to draw behavioral insights from high-performance black-box machine-learning models remains largely unsolved in travel behavior modeling. This paper aims to provide a comprehensive comparison of the two approaches by examining the key similarities and differences in model development, evaluation, and behavioral interpretation between logit and machine-learning models for travel mode choice. To complement the theoretical discussion, the paper also empirically evaluates the two approaches on stated-preference survey data for a new type of transit system that integrates high-frequency fixed-route services and ridesourcing. The results show that machine learning can produce significantly higher predictive accuracy than logit models. Moreover, machine learning and logit models largely agree on many aspects of behavioral interpretation. In addition, machine learning can automatically capture nonlinear relationships between the input features and choice outcomes. The paper concludes that there is great potential in merging ideas from machine learning and conventional statistical methods to develop refined models for travel behavior research, and it suggests some new research directions.
