This paper focuses on the expected difference in borrowers' repayment when there is a change in the lenders' credit decisions. Classical estimators overlook the confounding effects, and hence the estimation error can be substantial. We therefore propose an alternative approach to constructing estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical and the proposed estimators in estimating the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under simulated datasets that exhibit different levels of causality, degrees of nonlinearity, and distributional properties. Most importantly, we apply our approach to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial when the causal effects are accounted for correctly.
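The abstract does not spell out the estimator, so the following is only a minimal sketch, under assumed column names and an assumed inverse-propensity-weighted (IPW) correction, of why a naive comparison of approved and rejected borrowers is confounded while a propensity-adjusted contrast is not:

```python
# Minimal sketch (not the paper's estimator): compare a naive difference-in-means
# with an IPW estimate of E[repayment | approve] - E[repayment | reject] when a
# confounder (credit_score) drives both the lending decision and repayment.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
credit_score = rng.normal(size=n)                                 # confounder
approve = rng.binomial(1, 1 / (1 + np.exp(-2 * credit_score)))    # lender decision
repay = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * approve + 1.5 * credit_score))))

# Naive estimator: ignores that credit_score drives both decision and repayment.
naive = repay[approve == 1].mean() - repay[approve == 0].mean()

# IPW estimator: reweight outcomes by the estimated propensity of approval.
ps = LogisticRegression().fit(credit_score.reshape(-1, 1), approve)
p = ps.predict_proba(credit_score.reshape(-1, 1))[:, 1]
ipw = np.mean(approve * repay / p) - np.mean((1 - approve) * repay / (1 - p))

print(f"naive: {naive:.3f}  IPW: {ipw:.3f}")  # IPW sits much closer to the true effect
```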
The paper examines the potential of deep learning to support decisions in financial risk management. We develop a deep learning model for predicting whether individual spread traders secure profits from future trades. This task embodies typical modeling challenges faced in risk and behavior forecasting. Conventional machine learning requires data that is representative of the feature-target relationship and relies on the often costly development, maintenance, and revision of handcrafted features. Consequently, modeling highly variable, heterogeneous patterns such as trader behavior is challenging. Deep learning promises a remedy. By automatically learning hierarchical, distributed representations of the data (e.g., risk-taking behavior), it uncovers generative features that determine the target (e.g., traders' profitability), avoids manual feature engineering, and is more robust to change (e.g., dynamic market conditions). The results of employing a deep network for operational risk forecasting confirm the feature learning capability of deep learning, provide guidance on designing a suitable network architecture, and demonstrate the superiority of deep learning over machine learning and rule-based benchmarks.
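As a purely illustrative sketch, and not the architecture studied in the paper, a feedforward network for the profit/no-profit prediction task might look as follows; the layer widths, dropout rate, and feature count are assumptions:

```python
# Minimal sketch of a deep feedforward classifier mapping raw trader/behavior
# features to a "next trade is profitable" label. All hyperparameters are assumed.
import torch
import torch.nn as nn

n_features = 40  # assumed number of raw inputs per trader

model = nn.Sequential(
    nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(64, 1),            # logit for the profit label
)
loss_fn = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on random stand-in data.
X = torch.randn(256, n_features)
y = torch.randint(0, 2, (256, 1)).float()
opt.zero_grad()
loss = loss_fn(model(X), y)
loss.backward()
opt.step()
```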
An approach to the modelling of volatile time series using a class of uniformity-preserving transforms for uniform random variables is proposed. V-transforms describe the relationship between quantiles of the stationary distribution of the time series and quantiles of the distribution of a predictable volatility proxy variable. They can be represented as copulas and permit the formulation and estimation of models that combine arbitrary marginal distributions with copula processes for the dynamics of the volatility proxy. The idea is illustrated using a Gaussian ARMA copula process and the resulting model is shown to replicate many of the stylized facts of financial return series and to facilitate the calculation of marginal and conditional characteristics of the model including quantile measures of risk. Estimation is carried out by adapting the exact maximum likelihood approach to the estimation of ARMA processes and the model is shown to be competitive with standard GARCH in an empirical application to Bitcoin return data.
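The paper treats a general class of v-transforms; the following sketch uses only the simplest symmetric case, V(u) = |2u - 1|, to illustrate numerically that such a transform maps a Uniform(0,1) probability-integral transform of a return to another Uniform(0,1) variable for a volatility proxy:

```python
# Minimal numerical sketch of uniformity preservation under the symmetric
# v-transform V(u) = |2u - 1| (an illustrative special case, not the full class).
import numpy as np

rng = np.random.default_rng(1)
u = rng.uniform(size=100_000)        # PIT values of returns: Uniform(0, 1)
v = np.abs(2 * u - 1)                # v-transformed values (volatility-proxy PIT)

# Both samples should look Uniform(0, 1); compare a few empirical quantiles.
qs = [0.1, 0.5, 0.9]
print(np.quantile(u, qs).round(3), np.quantile(v, qs).round(3))
```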
This paper derives time-uniform confidence sequences (CS) for causal effects in experimental and observational settings. A confidence sequence for a target parameter $\psi$ is a sequence of confidence intervals $(C_t)_{t=1}^\infty$ such that every one of these intervals simultaneously captures $\psi$ with high probability. Such CSs provide valid statistical inference for $\psi$ at arbitrary stopping times, unlike classical fixed-time confidence intervals which require the sample size to be fixed in advance. Existing methods for constructing CSs focus on the nonasymptotic regime where certain assumptions (such as known bounds on the random variables) are imposed, while doubly robust estimators of causal effects rely on (asymptotic) semiparametric theory. We use sequenti
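To illustrate the time-uniform guarantee only (this is not the doubly robust construction of the paper), a simple confidence sequence for the mean of i.i.d. observations bounded in [0, 1] can be built from Hoeffding's inequality plus a union bound over time:

```python
# Minimal sketch of a time-uniform confidence sequence for a bounded mean:
# allocate error alpha / (t(t+1)) to each time t, which sums to alpha, so the
# intervals hold simultaneously over all t with probability at least 1 - alpha.
import numpy as np

def hoeffding_cs(x, alpha=0.05):
    """Return (lower, upper) arrays of a confidence sequence for the mean of x in [0, 1]."""
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x) + 1)
    mean = np.cumsum(x) / t
    eps = np.sqrt(np.log(2 * t * (t + 1) / alpha) / (2 * t))  # Hoeffding radius at level alpha/(t(t+1))
    return mean - eps, mean + eps

rng = np.random.default_rng(2)
lo, hi = hoeffding_cs(rng.uniform(size=1000))
print(lo[-1].round(3), hi[-1].round(3))  # brackets the true mean 0.5 uniformly in t
```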
Crime prevention strategies based on early intervention depend on accurate risk assessment instruments for identifying high-risk youth. It is important in this context that the instruments be convenient to administer, which means, in particular, that they must be reasonably brief; adaptive screening tests are useful for this purpose. Although item response theory (IRT) has a long and rich history of producing reliable adaptive tests, adaptive tests constructed using classification and regression trees are becoming a popular alternative to the traditional IRT approach for item selection. On the upside, unlike IRT, tree-based questionnaires require no real-time parameter estimation during administration. On the downside, while item response theory provides robust criteria for terminating the exam, the stopping criterion for a tree-based adaptive test (the maximum tree depth) is unclear. We present a Bayesian decision-theoretic approach for characterizing the trade-offs of administering tree-based questionnaires of different lengths. This formalism involves specifying 1) a utility function measuring the goodness of the assessment; 2) a target population over which this utility should be maximized; and 3) an action space comprising different-length assessments, populated via a tree-fitting algorithm. Using this framework, we provide uncertainty estimates for the trade-offs of shortening the exam, allowing practitioners to determine an optimal exam length in a principled way. The method is demonstrated through an application to youth delinquency risk assessment in Honduras.
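The length/utility trade-off can be sketched with an illustrative utility (classification accuracy minus a per-item cost); the paper's utility function, target population, and tree-fitting procedure differ, and the data below are synthetic stand-ins:

```python
# Minimal sketch: treat tree depth as a proxy for the number of items administered
# and pick the depth that maximizes an assumed utility = accuracy - cost * depth.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.integers(0, 4, size=(2000, 12))                       # 12 questionnaire items
risk = (X[:, :4].sum(axis=1) + rng.normal(0, 1, 2000)) > 6    # latent high-risk label

item_cost = 0.02  # assumed utility penalty per additional question asked
for depth in range(1, 8):                                     # depth ~ exam length
    acc = cross_val_score(DecisionTreeClassifier(max_depth=depth), X, risk, cv=5).mean()
    print(depth, round(acc - item_cost * depth, 3))           # choose the depth with highest utility
```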
Predictive models -- learned from observational data not covering the complete data distribution -- can rely on spurious correlations in the data for making predictions. These correlations make the models brittle and hinder generalization. One solution for achieving strong generalization is to incorporate causal structures in the models; such structures constrain learning by ignoring correlations that contradict them. However, learning these structures is a hard problem in itself. Moreover, it is not clear how to incorporate the machinery of causality into online continual learning. In this work, we take an indirect approach to discovering causal models. Instead of searching for the true causal model directly, we propose an online algorithm that continually detects and removes spurious features. Our algorithm builds on the idea that the correlation of a spurious feature with the target is not constant over time; as a result, the weight associated with that feature is constantly changing. We show that by continually removing such features, our method converges to solutions that generalize strongly. Moreover, our method combined with random search can also discover non-spurious features from raw sensory data. Finally, our work highlights that the information present in the temporal structure of the problem -- destroyed by shuffling the data -- is essential for detecting spurious features online.
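The core idea, that a spurious feature's weight keeps moving under online updates while stable features settle, can be illustrated as follows; this is only a sketch under an assumed drift schedule and tracking statistic, not the paper's algorithm:

```python
# Minimal sketch: train a linear model with online SGD while one feature's
# correlation with the target flips sign over time; a running measure of weight
# movement stays large for that feature, flagging it as spurious.
import numpy as np

rng = np.random.default_rng(4)
d, steps, lr = 3, 20_000, 0.01
w = np.zeros(d)
moves = np.zeros(d)                      # exponential average of |weight updates|

for t in range(steps):
    x = rng.normal(size=d)
    sign = 1.0 if (t // 2000) % 2 == 0 else -1.0   # spurious correlation drifts over time
    y = 1.5 * x[0] + sign * 1.0 * x[2] + rng.normal(0, 0.1)
    grad = (w @ x - y) * x               # squared-error gradient
    w -= lr * grad
    moves = 0.999 * moves + 0.001 * np.abs(lr * grad)

print(w.round(2), moves.round(4))        # feature 2 shows persistently large weight movement
```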