Algorithms that favor popular items are used to help us select among many choices, from engaging articles on a social media news feed to songs and books that others have purchased, and from top-ranked search engine results to highly cited scientific papers. The goal of these algorithms is to identify high-quality items such as reliable news, beautiful movies, prestigious information sources, and important discoveries; in short, high-quality content should rank at the top. Prior work has shown that choosing what is popular may amplify random fluctuations and ultimately lead to sub-optimal rankings. Nonetheless, it is often assumed that recommending what is popular will help high-quality content bubble up in practice. Here we identify the conditions in which popularity may be a viable proxy for quality content by studying a simple model of a cultural market endowed with an intrinsic notion of quality. A parameter representing the cognitive cost of exploration controls the critical trade-off between quality and popularity. We find a regime of intermediate exploration cost where an optimal balance exists, such that choosing what is popular actually promotes high-quality items to the top. Outside of these limits, however, popularity bias is more likely to hinder quality. These findings clarify the effects of algorithmic popularity bias on quality outcomes, and may inform the design of more principled mechanisms for techno-social cultural markets.
There has been rapidly growing interest in the use of algorithms in hiring, especially as a means to address or mitigate bias. Yet, to date, little is known about how these methods are used in practice. How are algorithmic assessments built, validated, and examined for bias? In this work, we document and analyze the claims and practices of companies offering algorithms for employment assessment. In particular, we identify vendors of algorithmic pre-employment assessments (i.e., algorithms to screen candidates), document what they have disclosed about their development and validation procedures, and evaluate their practices, focusing particularly on efforts to detect and mitigate bias. Our analysis considers both technical and legal perspectives. Technically, we consider the various choices vendors make regarding data collection and prediction targets, and explore the risks and trade-offs that these choices pose. We also discuss how algorithmic de-biasing techniques interface with, and create challenges for, antidiscrimination law.
Shared e-scooters have become a familiar sight in many cities around the world. Yet the role they play in the mobility space is still poorly understood. This paper presents a study of the use of Bird e-scooters in the city of Atlanta. Starting with raw data that contains the location of available Birds over time, the study identifies trips and leverages the Google Places API to associate each trip origin and destination with a Point of Interest (POI). The resulting trip data is then used to understand the role of e-scooters in mobility by clustering trips using 10 collections of POIs, including business, food and recreation, parking, transit, health, and residential. The trips between these POI clusters reveal some surprising, albeit sensible, findings about the role of e-scooters in mobility, as well as the times of day when they are most popular.
We analyze the role that popularity and novelty play in attracting the attention of users to dynamic websites. We do so by determining the performance of three different strategies that can be utilized to maximize attention. The first one prioritizes novelty while the second emphasizes popularity. A third strategy looks myopically into the future and prioritizes stories that are expected to generate the most clicks within the next few minutes. We show that the first two strategies should be selected on the basis of the rate of novelty decay, while the third strategy performs sub-optimally in most cases. We also demonstrate that the relative performance of the first two strategies as a function of the rate of novelty decay changes abruptly around a critical value, resembling a phase transition in the physical world.
We present a method for accurately predicting the long-term popularity of online content from early measurements of user access. Using two content sharing portals, Youtube and Digg, we show that by modeling the accrual of views and votes on content offered by these services we can predict the long-term dynamics of individual submissions from initial data. In the case of Digg, measuring access to given stories during the first two hours allows us to forecast their popularity 30 days ahead with remarkable accuracy, while downloads of Youtube videos need to be followed for 10 days to attain the same performance. The differing time scales of the predictions are shown to be due to differences in how content is consumed on the two portals: Digg stories quickly become outdated, while Youtube videos are still found long after they are initially submitted to the portal. We show that predictions are more accurate for submissions for which attention decays quickly, whereas predictions for evergreen content will be prone to larger errors.
The emergence of online political advertising has come with little regulation, allowing political advertisers on social media to avoid accountability. We analyze how transparency deficits caused by dark money and group impermanence relate to the sentiment of political ads on Facebook. We obtained 525,796 ads with FEC-registered advertisers from Facebook's ad library that ran between August and November 2018. We compare ads run by candidates, parties, and outside groups, which we classify by (i) their donor transparency (dark money or disclosed) and (ii) the group's permanence (disappearing after 2018 or re-registering). Ads run by dark money and disappearing outside groups were more negative than those run by transparent and re-registering groups, respectively. Outside groups as a whole also ran more negative ads than candidates and parties. These results suggest that transparency for political speech is associated with advertising tone: the most negative advertising comes from organizations with less donor disclosure and permanence.