ﻻ يوجد ملخص باللغة العربية
We consider the problem of evaluating the quality of startup companies. This can be quite challenging due to the rarity of successful startup companies and the complexity of factors which impact such success. In this work we collect data on tens of thousands of startup companies, their performance, the backgrounds of their founders, and their investors. We develop a novel model for the success of a startup company based on the first passage time of a Brownian motion. The drift and diffusion of the Brownian motion associated with a startup company are a function of features based its sector, founders, and initial investors. All features are calculated using our massive dataset. Using a Bayesian approach, we are able to obtain quantitative insights about the features of successful startup companies from our model. To test the performance of our model, we use it to build a portfolio of companies where the goal is to maximize the probability of having at least one company achieve an exit (IPO or acquisition), which we refer to as winning. This $textit{picking winners}$ framework is very general and can be used to model many problems with low probability, high reward outcomes, such as pharmaceutical companies choosing drugs to develop or studios selecting movies to produce. We frame the construction of a picking winners portfolio as a combinatorial optimization problem and show that a greedy solution has strong performance guarantees. We apply the picking winners framework to the problem of choosing a portfolio of startup companies. Using our model for the exit probabilities, we are able to construct out of sample portfolios which achieve exit rates as high as 60%, which is nearly double that of top venture capital firms.
Air pollution is a major risk factor for global health, with both ambient and household air pollution contributing substantial components of the overall global disease burden. One of the key drivers of adverse health effects is fine particulate matte
The Argo data is a modern oceanography dataset that provides unprecedented global coverage of temperature and salinity measurements in the upper 2,000 meters of depth of the ocean. We study the Argo data from the perspective of functional data analys
Recent data-driven approaches have shown great potential in early prediction of battery cycle life by utilizing features from the discharge voltage curve. However, these studies caution that data-driven approaches must be combined with specific desig
The analysis of the intraday dynamics of correlations among high-frequency returns is challenging due to the presence of asynchronous trading and market microstructure noise. Both effects may lead to significant data reduction and may severely undere
We consider the problem of selecting a portfolio of entries of fixed cardinality for contests with top-heavy payoff structures, i.e. most of the winnings go to the top-ranked entries. This framework is general and can be used to model a variety of pr