The Random Utility Maximization model is by far the most widely adopted framework for estimating consumer choice behavior. However, behavioral economics has provided strong empirical evidence of irrational choice behavior, such as halo effects, which is incompatible with this framework. Models belonging to the Random Utility Maximization family may therefore fail to capture such behavior accurately. More general choice models that overcome these limitations have been proposed, but their flexibility comes at the price of an increased risk of overfitting, so estimating them remains a challenge. In this work, we propose an estimation method for the recently proposed Generalized Stochastic Preference choice model, which subsumes the family of Random Utility Maximization models and is capable of capturing halo effects. Specifically, we show how to use partially ranked preferences to efficiently model rational and irrational customer types from transaction data. Our estimation procedure is based on column generation, where relevant customer types are efficiently extracted by expanding a tree-like data structure containing the customer behaviors. Further, we propose a new dominance rule among customer types whose effect is to prioritize low orders of interactions among products. An extensive set of experiments assesses the predictive accuracy of the proposed approach. Our results show that accounting for irrational preferences can boost predictive accuracy by 12.5% on average when tested on a real-world dataset from a large chain of grocery and drug stores.
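To make the incompatibility concrete, the following minimal sketch uses a multinomial logit model, one member of the Random Utility Maximization family, to show the regularity property: enlarging the offered assortment never raises the choice probability of a product that was already offered. A halo effect is a violation of this property. The product names, utility values, and the function name `mnl_probabilities` are illustrative assumptions and do not come from the abstract.

```python
import math

def mnl_probabilities(utilities, assortment):
    """Multinomial logit choice probabilities over an offered assortment.

    A no-purchase option with utility 0 is assumed (an illustrative choice).
    """
    weights = {j: math.exp(utilities[j]) for j in assortment}
    denom = 1.0 + sum(weights.values())  # the 1.0 is exp(0) for no-purchase
    return {j: w / denom for j, w in weights.items()}

# Hypothetical mean utilities for three products.
utilities = {"A": 1.0, "B": 0.5, "C": 0.2}

p_small = mnl_probabilities(utilities, ["A", "B"])
p_large = mnl_probabilities(utilities, ["A", "B", "C"])

# Regularity: adding product C cannot increase the probability of A or B.
regularity_holds = all(p_large[j] <= p_small[j] for j in p_small)
```

Under a halo effect, the probability of choosing A could go up after C is added, which no assignment of utilities in this sketch can reproduce.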
We study the impact of weak identification in discrete choice models, and provide insights into the determinants of identification strength in these models. Using these insights, we propose a novel test that can consistently detect weak identification in commonly applied discrete choice models, such as probit, logit, and many of their extensions. Furthermore, we demonstrate that when the null hypothesis of weak identification is rejected, Wald-based inference can be carried out using standard formulas and critical values. A Monte Carlo study compares our proposed testing approach against commonly applied weak identification tests. The results demonstrate both the good performance of our approach and the fundamental failure of conventional weak identification tests for linear models when applied to discrete choice models. Finally, we compare our approach against those commonly applied in the literature in two empirical examples: married women's labor force participation, and US food aid and civil conflicts.
In nonlinear panel data models, fixed effects methods are often criticized because they cannot identify average marginal effects (AMEs) in short panels. The common argument is that the identification of AMEs requires knowledge of the distribution of unobserved heterogeneity, but this distribution is not identified in a fixed effects model with a short panel. In this paper, we derive identification results that contradict this argument. In a panel data dynamic logit model, and for T as small as four, we prove the point identification of different AMEs, including causal effects of changes in the lagged dependent variable or in the duration in the last choice. Our proofs are constructive and provide simple closed-form expressions for the AMEs in terms of probabilities of choice histories. We illustrate our results using Monte Carlo experiments and an empirical application of a dynamic structural model of consumer brand choice with state dependence.
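As a point of reference for what an AME is, the sketch below computes the average marginal effect of the lagged dependent variable in a simple dynamic logit when the distribution of the individual effect is known. This illustrates the definition only; it is not the closed-form, heterogeneity-free expression the abstract refers to. The specification (an individual effect `alpha` plus `beta` times the lagged choice, no other covariates), the discrete support, and all function and parameter names are assumptions made for illustration.

```python
import math

def logistic(x):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))

def ame_lagged_choice(beta, alpha_support, alpha_probs):
    """AME of switching the lagged choice from 0 to 1.

    Averages Lambda(alpha + beta) - Lambda(alpha) over an assumed discrete
    distribution of the individual effect alpha.
    """
    return sum(
        p * (logistic(a + beta) - logistic(a))
        for a, p in zip(alpha_support, alpha_probs)
    )

# Hypothetical three-point distribution of unobserved heterogeneity.
support = [-1.0, 0.0, 1.0]
probs = [0.25, 0.5, 0.25]

ame = ame_lagged_choice(1.0, support, probs)
ame_no_state_dependence = ame_lagged_choice(0.0, support, probs)
```

The computation needs `alpha_probs`, which is exactly the object that is not identified in a short fixed effects panel; this is the common argument the abstract sets out to contradict.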
Since its inception, the choice modelling field has been dominated by theory-driven models. The recent emergence and growing popularity of machine learning models offer an alternative data-driven approach. Machine learning models, techniques and practices could help overcome problems and limitations of the current theory-driven modelling paradigm, e.g. relating to the ad hoc search for the optimal model specification, and theory-driven choice models' inability to work with text and image data. However, despite the potential value of machine learning to improve choice modelling practices, the choice modelling field has been somewhat hesitant to embrace machine learning. The aim of this paper is to facilitate (further) integration of machine learning in the choice modelling field. To achieve this objective, we make the case that (further) integration of machine learning is beneficial for the choice modelling field, and we shed light on where the benefits of further integration can be found. Specifically, we take the following approach. First, we clarify the similarities and differences between the two modelling paradigms. Second, we provide a literature overview on the use of machine learning for choice modelling. Third, we reinforce the strengths of the current theory-driven modelling paradigm and compare this with the machine learning modelling paradigm. Fourth, we identify opportunities for embracing machine learning for choice modelling, while recognising the strengths of the current theory-driven paradigm. Finally, we put forward a vision on the future relationship between the theory-driven choice models and machine learning.
Human decision making underlies the data-generating process in multiple application areas, and models explaining and predicting choices made by individuals are in high demand. Discrete choice models are widely studied in economics and computational social sciences. As digital social networking facilitates information flow and the spread of influence between individuals, new advances in modeling are needed to incorporate social information into these models in addition to characteristic features affecting individual choices. In this paper, we propose two novel models with scalable training algorithms: local logistics graph regularization (LLGR) and latent class graph regularization (LCGR) models. We add social regularization to represent similarity between friends, and we introduce latent classes to account for possible preference discrepancies between different social groups. Training of the LLGR model is performed using the alternating direction method of multipliers (ADMM), and training of the LCGR model is performed using a specialized Monte Carlo expectation maximization (MCEM) algorithm. Scalability to large graphs is achieved by parallelizing computation in both the expectation and the maximization steps. The LCGR model is the first latent class classification model that incorporates social relationships among individuals represented by a given graph. To evaluate our two models, we consider three classes of data to illustrate a typical large-scale use case in internet and social media applications. We experiment on synthetic datasets to empirically explain when the proposed model is better than vanilla classification models that do not exploit graph structure. We also experiment on real-world data, including both small-scale and large-scale datasets, to demonstrate on which types of datasets our model can be expected to outperform state-of-the-art models.
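The idea of social regularization can be sketched as an objective that adds, to each individual's logistic loss, a penalty on the difference between the parameter vectors of connected individuals. The abstract does not state the exact form of the penalty, so the squared-difference penalty used here, the function name `graph_regularized_loss`, and the parameter `lam` are assumptions for illustration rather than the LLGR objective itself.

```python
import math

def graph_regularized_loss(weights, features, labels, edges, lam):
    """Per-individual logistic losses plus a penalty across friendship edges.

    weights[i]  : parameter vector of individual i (list of floats)
    features[i] : feature vector of individual i
    labels[i]   : observed binary choice in {0, 1}
    edges       : list of (i, j) pairs of connected individuals
    lam         : regularization strength (hypothetical name)
    """
    loss = 0.0
    for w, x, y in zip(weights, features, labels):
        score = sum(wk * xk for wk, xk in zip(w, x))
        # Logistic negative log-likelihood: log(1 + exp(score)) - y * score
        loss += math.log1p(math.exp(score)) - y * score
    for i, j in edges:
        # Assumed penalty: squared distance between friends' parameters.
        loss += lam * sum((a - b) ** 2 for a, b in zip(weights[i], weights[j]))
    return loss

features = [[1.0, 0.5], [1.0, -0.3]]
labels = [1, 0]
edges = [(0, 1)]

same = [[0.5, -0.2], [0.5, -0.2]]       # friends share parameters
different = [[0.5, -0.2], [-0.5, 0.4]]  # friends disagree

loss_same = graph_regularized_loss(same, features, labels, edges, lam=5.0)
loss_same_no_penalty = graph_regularized_loss(same, features, labels, edges, lam=0.0)
loss_diff = graph_regularized_loss(different, features, labels, edges, lam=5.0)
loss_diff_no_penalty = graph_regularized_loss(different, features, labels, edges, lam=0.0)
```

When friends share the same parameters the penalty vanishes, so the regularizer only affects the fit where connected individuals would otherwise diverge.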
We propose an estimation procedure for discrete choice models of differentiated products with possibly high-dimensional product attributes. In our model, high-dimensional attributes can be determinants of both the mean and the variance of the indirect utility of a product. The key restriction in our model is that the high-dimensional attributes affect the variance of indirect utilities only through finitely many indices. Within the framework of the random-coefficients logit model, we show a bound on the error rate of an $l_1$-regularized minimum distance estimator and prove the asymptotic linearity of the de-biased estimator.
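For readers unfamiliar with $l_1$ regularization, the sketch below shows the soft-thresholding operator, which is the proximal operator of the $l_1$ penalty and the reason such estimators set small coefficients exactly to zero. It is a generic illustration of the penalty, not the minimum distance estimator or the de-biasing step from the abstract; the function name and the numeric values are assumptions.

```python
import math

def soft_threshold(z, lam):
    """Minimizer over b of 0.5 * (b - z)**2 + lam * abs(b), for lam >= 0."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

# Hypothetical unpenalized coefficients for a sparse problem.
raw_coefficients = [3.0, -0.5, 0.2, -3.0]
shrunk = [soft_threshold(z, 1.0) for z in raw_coefficients]
```

Large coefficients are shrunk toward zero by `lam`, which introduces the bias that a de-biased estimator is designed to correct, while coefficients smaller than `lam` in magnitude are removed entirely.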