ﻻ يوجد ملخص باللغة العربية
We propose a new approach for constructing synthetic pseudo-panel data from cross-sectional data. The pseudo panel and the preferences it intends to describe is constructed at the individual level and is not affected by aggregation bias across cohorts. This is accomplished by creating a high-dimensional probabilistic model representation of the entire data set, which allows sampling from the probabilistic model in such a way that all of the intrinsic correlation properties of the original data are preserved. The key to this is the use of deep learning algorithms based on the Conditional Variational Autoencoder (CVAE) framework. From a modelling perspective, the concept of a model-based resampling creates a number of opportunities in that data can be organized and constructed to serve very specific needs of which the forming of heterogeneous pseudo panels represents one. The advantage, in that respect, is the ability to trade a serious aggregation bias (when aggregating into cohorts) for an unsystematic noise disturbance. Moreover, the approach makes it possible to explore high-dimensional sparse preference distributions and their linkage to individual specific characteristics, which is not possible if applying traditional pseudo-panel methods. We use the presented approach to reveal the dynamics of transport preferences for a fixed pseudo panel of individuals based on a large Danish cross-sectional data set covering the period from 2006 to 2016. The model is also utilized to classify individuals into slow and fast movers with respect to the speed at which their preferences change over time. It is found that the prototypical fast mover is a young woman who lives as a single in a large city whereas the typical slow mover is a middle-aged man with high income from a nuclear family who lives in a detached house outside a city.
In the machine learning domain, active learning is an iterative data selection algorithm for maximizing information acquisition and improving model performance with limited training samples. It is very useful, especially for the industrial applicatio
We introduce a formulation of optimal transport problem for distributions on function spaces, where the stochastic map between functional domains can be partially represented in terms of an (infinite-dimensional) Hilbert-Schmidt operator mapping a Hi
We develop a new model of insulin-glucose dynamics for forecasting blood glucose in type 1 diabetics. We augment an existing biomedical model by introducing time-varying dynamics driven by a machine learning sequence model. Our model maintains a phys
In population synthesis applications, when considering populations with many attributes, a fundamental problem is the estimation of rare combinations of feature attributes. Unsurprisingly, it is notably more difficult to reliably representthe sparser
We consider the problem of estimating a ranking on a set of items from noisy pairwise comparisons given item features. We address the fact that pairwise comparison data often reflects irrational choice, e.g. intransitivity. Our key observation is tha