Introducing Super Pseudo Panels: Application to Transport Preference Dynamics


Abstract

We propose a new approach for constructing synthetic pseudo-panel data from cross-sectional data. The pseudo panel, and the preferences it is intended to describe, are constructed at the individual level and are therefore not affected by aggregation bias across cohorts. This is accomplished by building a high-dimensional probabilistic model of the entire data set and sampling from that model in such a way that all of the intrinsic correlation properties of the original data are preserved. The key is the use of deep learning algorithms based on the Conditional Variational Autoencoder (CVAE) framework. From a modelling perspective, model-based resampling creates a number of opportunities, because data can be organized and constructed to serve very specific needs, of which the formation of heterogeneous pseudo panels is one. The advantage in that respect is the ability to trade a serious aggregation bias (which arises when aggregating into cohorts) for an unsystematic noise disturbance. Moreover, the approach makes it possible to explore high-dimensional sparse preference distributions and their linkage to individual-specific characteristics, which is not possible with traditional pseudo-panel methods. We use the approach to reveal the dynamics of transport preferences for a fixed pseudo panel of individuals, based on a large Danish cross-sectional data set covering the period from 2006 to 2016. The model is also used to classify individuals into slow and fast movers with respect to the speed at which their preferences change over time. We find that the prototypical fast mover is a young single woman living in a large city, whereas the typical slow mover is a middle-aged man with a high income, from a nuclear family, living in a detached house outside a city.
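
For orientation, the generic CVAE training objective can be written as below. This is the standard textbook form of the conditional evidence lower bound, not necessarily the exact specification used in the paper. The symbols are assumptions introduced here for illustration: x denotes an individual's observed record, c the conditioning variables, z the latent variables, and θ and φ the decoder and encoder parameters.

```latex
% Standard CVAE evidence lower bound (generic form; symbols x, c, z are
% illustrative assumptions, not notation taken from the paper).
\log p_\theta(x \mid c)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, c)}\!\left[ \log p_\theta(x \mid z, c) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c) \right)
```

Under this generic formulation, model-based resampling would amount to drawing z from the prior and decoding it together with a chosen value of c. Which variables the paper actually places in c is not stated in the abstract.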
