We develop tools for utilizing correspondence experiments to detect illegal discrimination by individual employers. Employers violate US employment law if their propensity to contact applicants depends on protected characteristics such as race or sex. We establish identification of higher moments of the causal effects of protected characteristics on callback rates as a function of the number of fictitious applications sent to each job ad. These moments are used to bound the fraction of jobs that illegally discriminate. Applying our results to three experimental datasets, we find evidence of significant employer heterogeneity in discriminatory behavior, with the standard deviation of gaps in job-specific callback probabilities across protected groups averaging roughly twice the mean gap. In a recent experiment manipulating racially distinctive names, we estimate that at least 85% of jobs that contact both of two white applications and neither of two black applications are engaged in illegal discrimination. To assess the tradeoff between type I and II errors presented by these patterns, we consider the performance of a series of decision rules for investigating suspicious callback behavior under a simple two-type model that rationalizes the experimental data. Though, in our preferred specification, only 17% of employers are estimated to discriminate on the basis of race, we find that an experiment sending 10 applications to each job would enable accurate detection of 7-10% of discriminators while falsely accusing fewer than 0.2% of non-discriminators. A minimax decision rule acknowledging partial identification of the joint distribution of callback rates yields higher error rates but more investigations than our baseline two-type model. Our results suggest illegal labor market discrimination can be reliably monitored with relatively small modifications to existing audit designs.
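The tradeoff between detection and false accusation described above can be illustrated with a toy calculation. This is a minimal sketch, not the paper's estimator: it assumes independent binomial callbacks, five applications per group, a simple threshold rule on the callback gap, and hypothetical callback probabilities chosen only for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k callbacks out of n applications."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def flag_probability(n_per_group, p_white, p_black, threshold):
    """P(white callbacks - black callbacks >= threshold), assuming independent binomials."""
    total = 0.0
    for w in range(n_per_group + 1):
        for b in range(n_per_group + 1):
            if w - b >= threshold:
                total += binom_pmf(w, n_per_group, p_white) * binom_pmf(b, n_per_group, p_black)
    return total


# Hypothetical callback probabilities, NOT estimates from the paper.
N_PER_GROUP = 5   # 10 applications per job in total
THRESHOLD = 4     # investigate when the gap is at least 4 callbacks

# Non-discriminating job: same callback probability for both groups.
false_accusation = flag_probability(N_PER_GROUP, 0.2, 0.2, THRESHOLD)
# Discriminating job: lower callback probability for black applicants.
detection = flag_probability(N_PER_GROUP, 0.5, 0.1, THRESHOLD)

print(f"false accusation rate: {false_accusation:.4f}")
print(f"detection rate:        {detection:.4f}")
```

Raising the threshold lowers both rates, which is the type I versus type II tradeoff that a decision rule has to balance.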
Instrumental variables (IV) regression is a popular method for estimating endogenous treatment effects. Conventional IV methods require that all instruments be relevant and valid. However, this is impractical, especially in high-dimensional models where a large set of candidate IVs is considered. In this paper, we propose an IV estimator that is robust to the presence of both invalid and irrelevant instruments (called R2IVE) for the estimation of endogenous treatment effects. This paper extends the scope of Kang et al. (2016) by considering a true high-dimensional IV model and a nonparametric reduced form equation. We show that our procedure selects the relevant and valid instruments consistently and that the proposed R2IVE is root-n consistent and asymptotically normal. Monte Carlo simulations demonstrate that the R2IVE performs favorably compared to existing high-dimensional IV estimators, such as NAIVE (Fan and Zhong, 2018) and sisVIVE (Kang et al., 2016), when invalid instruments exist. In the empirical study, we revisit the classic question of trade and growth (Frankel and Romer, 1999).
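Why instrument validity matters can be seen in a small simulation. The sketch below is not R2IVE; it uses the textbook just-identified IV ratio estimator on simulated data, and the data-generating process, coefficient values, and variable names are all assumptions made for illustration.

```python
import numpy as np


def iv_ratio(z, d, y):
    """Just-identified IV estimate: cov(z, y) / cov(z, d)."""
    return np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]


rng = np.random.default_rng(0)
n = 20_000
TRUE_EFFECT = 2.0  # hypothetical treatment effect

z_valid = rng.normal(size=n)    # affects y only through d
z_invalid = rng.normal(size=n)  # relevant, but also has a direct effect on y
u = rng.normal(size=n)          # unobserved confounder

# Treatment depends on both instruments and on the confounder.
d = z_valid + z_invalid + u + rng.normal(size=n)
# Outcome: the 0.5 * z_invalid term violates the exclusion restriction.
y = TRUE_EFFECT * d + 0.5 * z_invalid + 2.0 * u + rng.normal(size=n)

beta_ols = np.cov(d, y)[0, 1] / np.var(d, ddof=1)
beta_valid = iv_ratio(z_valid, d, y)
beta_invalid = iv_ratio(z_invalid, d, y)

print(f"OLS:                {beta_ols:.3f}")
print(f"IV, valid z:        {beta_valid:.3f}")
print(f"IV, invalid z:      {beta_invalid:.3f}")
```

In this design only the valid instrument recovers the true effect; OLS is biased by the confounder and the invalid instrument is biased by its direct effect on the outcome.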
In non-experimental settings, the Regression Discontinuity (RD) design is one of the most credible identification strategies for program evaluation and causal inference. However, RD treatment effect estimands are necessarily local, making statistical methods for the extrapolation of these effects a key area for development. We introduce a new method for extrapolation of RD effects that relies on the presence of multiple cutoffs, and is therefore design-based. Our approach employs an easy-to-interpret identifying assumption that mimics the idea of common trends in difference-in-differences designs. We illustrate our methods with data on a subsidized loan program on post-education attendance in Colombia, and offer new evidence on program effects for students with test scores away from the cutoff that determined program eligibility.
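One way to formalize a common-trends style assumption with two cutoffs is sketched below. The notation is an assumption introduced here for illustration, not taken from the abstract, and the sketch presumes a sharp design in which units are treated when their score is at or above the cutoff they face.

```latex
% Assumed notation (hypothetical, for illustration only):
%   cutoffs l < h, score X, C = cutoff a unit faces,
%   \mu_{c}(x)   = E[Y    | X = x, C = c]  (observed regression function),
%   \mu_{0,c}(x) = E[Y(0) | X = x, C = c]  (untreated potential outcome).
\begin{align*}
\text{Constant bias:}\quad
  & \mu_{0,l}(x) - \mu_{0,h}(x) = B \quad \text{for all } x \in [l, h), \\
\text{Bias identified at the low cutoff:}\quad
  & B = \lim_{x \uparrow l} \mu_{l}(x) - \mu_{h}(l), \\
\text{Extrapolated effect:}\quad
  & \tau_{l}(\bar{x}) = \mu_{l}(\bar{x}) - \bigl[\mu_{h}(\bar{x}) + B\bigr],
    \qquad \bar{x} \in (l, h).
\end{align*}
```

Units facing the high cutoff are still untreated at scores between the two cutoffs, so they play the role of the comparison group, and the difference measured at the low cutoff plays the role of the pre-period gap in a difference-in-differences design.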
We provide a novel inferential framework to estimate the exact affine Stone index (EASI) model and to analyze the welfare implications of price changes caused by taxes. Our inferential framework is based on a non-parametric specification of the stochastic errors in the EASI incomplete demand system using Dirichlet processes. Our proposal makes it possible to identify consumer clusters arising from unobserved preference heterogeneity while taking into account censoring, simultaneous endogeneity, and non-linearities. We perform an application based on a tax on electricity consumption in the Colombian economy. Our results suggest that there are four clusters due to unobserved preference heterogeneity, although 95% of our sample belongs to one cluster. This suggests that observable variables describe preferences well under the EASI model in our application. We find that utilities seem to be inelastic normal goods with non-linear Engel curves. Joint predictive distributions indicate that the electricity tax generates substitution effects between electricity and other non-utility goods. These distributions, as well as Slutsky matrices, suggest good model assessment. We find that there is a 95% probability that the equivalent variation as a percentage of income of the representative household is between 0.60% and 1.49% given an approximately 1% electricity tariff increase. However, there are heterogeneous effects, with higher socioeconomic strata facing greater welfare losses on average. This highlights the potentially remarkable welfare implications of taxation on inelastic services.
Financial advisors use questionnaires and discussions with clients to determine a suitable portfolio of assets that will allow clients to reach their investment objectives. Financial institutions assign risk ratings to each security they offer, and those ratings are used to guide clients and advisors toward an investment portfolio risk that suits the client's stated risk tolerance. This paper compares clients' Know Your Client (KYC) profile risk allocations to their investment portfolio risk selections using a value-at-risk discrepancy methodology. Value-at-risk is used to measure elicited and revealed risk to show whether clients are over-risked or under-risked, whether changes in KYC risk lead to changes in portfolio configuration, and whether cash flow affects a client's portfolio risk. We demonstrate the effectiveness of value-at-risk at measuring clients' elicited and revealed risk on a dataset provided by a private Canadian financial dealership of over $50,000$ accounts for over $27,000$ clients and $300$ advisors. By measuring both elicited and revealed risk using the same measure, we can determine how well a client's portfolio aligns with their stated goals. We believe that using value-at-risk to measure client risk provides valuable insight to advisors, helping them to ensure that their practice is KYC compliant, to better tailor their client portfolios to stated goals, to communicate advice to clients to either align their portfolios to stated goals or refresh their goals, and to monitor changes to the clients' risk positions across their practice.
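The idea of measuring elicited and revealed risk on the same scale can be sketched as follows. This is an illustrative example only, not the paper's methodology: it assumes historical (empirical quantile) value-at-risk, two hypothetical risk classes with simulated returns, and made-up KYC and portfolio weights.

```python
import numpy as np


def historical_var(returns, level=0.95):
    """Historical VaR: the loss exceeded with probability 1 - level, as a positive number."""
    return -np.quantile(returns, 1.0 - level)


rng = np.random.default_rng(0)

# Simulated periodic returns for two hypothetical risk classes (low, high).
asset_returns = np.column_stack([
    rng.normal(0.002, 0.01, size=5000),  # low-risk class
    rng.normal(0.006, 0.05, size=5000),  # high-risk class
])

# Hypothetical weights: what the KYC profile allows vs. what the client holds.
elicited_weights = np.array([0.7, 0.3])
revealed_weights = np.array([0.3, 0.7])

elicited_var = historical_var(asset_returns @ elicited_weights)
revealed_var = historical_var(asset_returns @ revealed_weights)

# Positive discrepancy: the portfolio carries more risk than the stated tolerance.
discrepancy = revealed_var - elicited_var
status = "over-risked" if discrepancy > 0 else "under-risked or aligned"

print(f"elicited VaR: {elicited_var:.4f}")
print(f"revealed VaR: {revealed_var:.4f}")
print(f"status:       {status}")
```

Because both quantities come from the same measure, the sign and size of the discrepancy can be compared across clients and tracked over time.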
Within the national innovation system literature, empirical analyses are severely lacking for developing economies. In particular, the low- and middle-income countries (LMICs) eligible for the World Bank's International Development Association (IDA) support are rarely part of any empirical discourse on growth, development, and innovation. One major issue hindering panel analyses of LMICs, and thus their inclusion in empirical discussion, is the lack of complete data. This work offers a new complete panel dataset with no missing values for LMICs eligible for IDA's support. I use a standard, widely respected multiple imputation technique (specifically, Predictive Mean Matching) developed by Rubin (1987). This technique respects the structure of multivariate continuous panel data at the country level. I employ this technique to create a large dataset consisting of many variables drawn from publicly available established sources. These variables, in turn, capture six crucial country-level capacities: technological capacity, financial capacity, human capital capacity, infrastructural capacity, public policy capacity, and social capacity. Such capacities are part and parcel of the National Absorptive Capacity Systems (NACS). The dataset (MSK dataset) thus produced contains data on 47 variables for 82 LMICs between 2005 and 2019. The dataset has passed a quality and reliability check and can thus be used for comparative analyses of national absorptive capacities and development, transition, and convergence analyses among LMICs.
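The core step of predictive mean matching can be sketched in a few lines. This is a simplified, single-imputation illustration with one predictor and simulated data, not the procedure used to build the dataset: the function name `pmm_impute`, the donor pool size `k`, and the data are assumptions, and a full multiple-imputation workflow would also draw the regression coefficients and repeat the process several times.

```python
import numpy as np


def pmm_impute(x, y, k=5, rng=None):
    """Fill NaNs in y with observed values borrowed from cases with similar predictions."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)

    # Fit a linear regression of y on x using the observed cases only.
    design = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(design[~missing], y[~missing], rcond=None)
    predicted = design @ beta

    observed_pred = predicted[~missing]
    observed_y = y[~missing]
    imputed = y.copy()
    for i in np.flatnonzero(missing):
        # The k observed cases whose predicted means are closest form the donor pool.
        donors = np.argsort(np.abs(observed_pred - predicted[i]))[:k]
        # Borrow the actual observed value of one randomly chosen donor.
        imputed[i] = observed_y[rng.choice(donors)]
    return imputed


# Simulated example with 20% of the outcome set to missing.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y_full = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)
y_missing = y_full.copy()
y_missing[rng.choice(200, size=40, replace=False)] = np.nan

completed = pmm_impute(x, y_missing, k=5, rng=rng)
print(f"missing before: {np.isnan(y_missing).sum()}, after: {np.isnan(completed).sum()}")
```

Because every imputed value is copied from an observed case, the completed variable stays within the range of the real data, which is one reason the method is often used for bounded or skewed indicators.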