Existing methods to estimate the prevalence of chronic hepatitis C (HCV) in New York City (NYC) are limited in scope and fail to assess hard-to-reach subpopulations with highest risk such as injecting drug users (IDUs). To address these limitations, we employ a Bayesian multi-parameter evidence synthesis model to systematically combine multiple sources of data, account for bias in certain data sources, and provide unbiased HCV prevalence estimates with associated uncertainty. Our approach improves on previous estimates by explicitly accounting for injecting drug use and including data from high-risk subpopulations such as the incarcerated, and is more inclusive, utilizing ten NYC data sources. In addition, we derive two new equations to allow age at first injecting drug use data for former and current IDUs to be incorporated into the Bayesian evidence synthesis, a first for this type of model. Our estimated overall HCV prevalence as of 2012 among NYC adults aged 20-59 years is 2.78% (95% CI 2.61-2.94%), which represents between 124,900 and 140,000 chronic HCV cases. These estimates suggest that HCV prevalence in NYC is higher than previously indicated from household surveys (2.2%) and the surveillance system (2.37%), and that HCV transmission is increasing among young injecting adults in NYC. An ancillary benefit from our results is an estimate of current IDUs aged 20-59 in NYC: 0.58% or 27,600 individuals.
Accurate estimates of subnational populations are important for policy formulation and monitoring population health indicators. For example, estimates of the number of women of reproductive age are important to understand the population at risk of maternal mortality and unmet need for contraception. However, in many low-income countries, data on population counts and components of population change are limited, and so subnational levels and trends are unclear. We present a Bayesian constrained cohort component model for the estimation and projection of subnational populations. The model builds on a cohort component projection framework, incorporates census data and estimates from the United Nations World Population Prospects, and uses characteristic mortality schedules to obtain estimates of population counts and the components of population change, including internal migration. The data required as inputs to the model are minimal and available across a wide range of countries, including most low-income countries. The model is applied to estimate and project populations by county in Kenya for 1979-2019, and validated against the 2019 Kenyan census.
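The cohort component framework that the model above builds on can be illustrated with a minimal deterministic sketch. This is not the Bayesian constrained model itself: it shows only one projection step for a single-sex population, and the function name, the age grouping, and all rates are hypothetical values chosen for illustration. As a simplification, fertility is applied to the start-of-interval population.

```python
def project_one_step(pop, survival, fertility, net_migration):
    """Advance a population vector by one age interval (illustrative only).

    pop[a]           -- people in age group a at the start of the interval
    survival[a]      -- proportion of group a surviving the interval
    fertility[a]     -- surviving births per person in group a (simplified)
    net_migration[a] -- net migrants added to group a at the end of the interval
    The last age group is open-ended, so it also keeps its own survivors.
    """
    n = len(pop)
    births = sum(f * p for f, p in zip(fertility, pop))
    nxt = [0.0] * n
    nxt[0] = births + net_migration[0]
    for a in range(1, n):
        # Each cohort ages into the next group, reduced by mortality.
        nxt[a] = pop[a - 1] * survival[a - 1] + net_migration[a]
    # Survivors already in the open-ended group stay there.
    nxt[n - 1] += pop[n - 1] * survival[n - 1]
    return nxt


# Hypothetical three-group population (young, reproductive age, older).
projected = project_one_step(
    pop=[100.0, 80.0, 60.0],
    survival=[0.9, 0.8, 0.5],
    fertility=[0.0, 0.5, 0.0],
    net_migration=[0.0, 0.0, 0.0],
)
```

In the abstract's setting, quantities such as the survival proportions and net migration would be estimated with uncertainty rather than fixed as they are here.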
The rise of Uber as the global alternative taxi operator has attracted a lot of interest recently. Aside from the media headlines that discuss the new phenomenon, e.g. how it has disrupted the traditional transportation industry, policy makers, economists, citizens and scientists have engaged in a discussion centred on how to integrate the new generation of sharing economy services into urban ecosystems. In this work, we aim to shed new light on the discussion by taking advantage of a publicly available longitudinal dataset that describes the mobility of yellow taxis in New York City. In addition to movement, this data contains information on the fares paid by taxi customers for each trip. As a result, we are able to provide a first head-to-head comparison between the iconic yellow taxi and its modern competitor, Uber, in one of the world's largest metropolitan centres. We identify situations in which Uber X, the cheapest version of the Uber taxi service, tends to be more expensive than yellow taxis for the same journey. We also demonstrate how Uber's economic model effectively takes advantage of well-known patterns in human movement. Finally, we take our analysis a step further by proposing a new mobile application that compares taxi prices in the city to facilitate travellers' taxi choices, with the ultimate aim of reducing commuter costs. Our study provides a case for how big datasets that become public can improve urban services for consumers by offering the opportunity for transparency in economic sectors that lack up-to-date regulations.
In this paper, we show a strong correlation between turnstile usage data of the New York City subway, provided by the Metropolitan Transportation Authority of New York City, and COVID-19 deaths and cases reported by the New York City Department of Health. The turnstile usage data indicate not only the usage of the city's subway but also the level of people's activity that promoted the high prevalence of COVID-19 that city dwellers experienced from March to May of 2020. While this correlation is apparent, it has not been demonstrated before. Here we demonstrate this correlation through the application of a long short-term memory neural network. We show that the correlation with COVID-19 prevalence and deaths accounts for the incubation and symptomatic phases that precede reported deaths. Having established this correlation, we use the Auto-Regressive Integrated Moving Average model to estimate the dates when the number of COVID-19 deaths and cases would approach zero once the reported number of deaths began to decrease. We also estimate the dates when the first cases and deaths occurred by back-tracing the data sets and compare them to the reported dates.
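The idea that an activity signal relates to outcomes only after a delay (incubation plus symptomatic phases) can be illustrated with a lagged Pearson correlation. This is a deliberately simpler stand-in, not the long short-term memory network used in the paper, and the function names and the two short series below are synthetic, hypothetical values rather than turnstile or health department data.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def best_lag(activity, outcome, max_lag):
    """Correlate activity at time t with outcome at time t + lag.

    Returns the lag with the largest absolute correlation and all scores.
    """
    scores = {}
    for lag in range(max_lag + 1):
        x = activity[: len(activity) - lag]  # drop the last `lag` points
        y = outcome[lag:]                    # drop the first `lag` points
        scores[lag] = pearson(x, y)
    return max(scores, key=lambda k: abs(scores[k])), scores


# Synthetic example: the outcome series is the activity series delayed by 2 steps.
activity = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
outcome = [0, 0] + activity[:-2]
lag, scores = best_lag(activity, outcome, max_lag=4)
```

On real data the delay would not be known in advance, and a strong lagged correlation by itself would not establish causation.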
Redlining is the discriminatory practice whereby institutions avoided investment in certain neighborhoods due to their demographics. Here we explore the lasting impacts of redlining on the spread of COVID-19 in New York City (NYC). Using data available through the Home Mortgage Disclosure Act, we construct a redlining index for each NYC census tract via a multi-level logistic model. We compare this redlining index with the COVID-19 statistics for each NYC Zip Code Tabulation Area. Accurate mappings of the pandemic would aid the identification of the most vulnerable areas and permit the most effective allocation of medical resources, while reducing ethnic health disparities.
Disease mapping is the field of spatial epidemiology concerned with estimating the spatial pattern in disease risk across $n$ areal units. One aim is to identify units exhibiting elevated disease risks, so that public health interventions can be made. Bayesian hierarchical models with a spatially smooth conditional autoregressive prior are used for this purpose, but they cannot identify the spatial extent of high-risk clusters. We therefore propose a two-stage solution to this problem, with the first stage being a spatially adjusted hierarchical agglomerative clustering algorithm. This algorithm is applied to data prior to the study period, and produces $n$ potential cluster structures for the disease data. The second stage fits a separate Poisson log-linear model to the study data for each cluster structure, which allows for step-changes in risk where two clusters meet. The most appropriate cluster structure is chosen by model comparison techniques, specifically by minimising the Deviance Information Criterion. The efficacy of the methodology is established by a simulation study, and is illustrated by a study of respiratory disease risk in Glasgow, Scotland.
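The final selection step, choosing the cluster structure with the smallest Deviance Information Criterion (DIC), can be sketched as follows. The sketch uses the standard definition DIC = mean deviance + p_D, where p_D is the mean deviance minus the deviance at the posterior mean. It assumes the deviance values have already been obtained from fitted models; the function names, structure labels, and numbers are hypothetical.

```python
from statistics import mean


def dic(deviance_samples, deviance_at_posterior_mean):
    """DIC = D_bar + p_D, with p_D = D_bar - D(theta_bar)."""
    d_bar = mean(deviance_samples)               # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean     # effective number of parameters
    return d_bar + p_d


def select_structure(candidates):
    """Pick the candidate with the smallest DIC.

    candidates maps a label to (deviance_samples, deviance_at_posterior_mean).
    """
    scores = {name: dic(samples, d_hat) for name, (samples, d_hat) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical deviance summaries for three candidate cluster structures.
candidates = {
    "1 cluster": ([220.0, 224.0, 222.0], 220.0),
    "3 clusters": ([200.0, 204.0, 202.0], 197.0),
    "10 clusters": ([196.0, 200.0, 198.0], 186.0),
}
best, scores = select_structure(candidates)
```

In this made-up example the ten-cluster structure fits best on mean deviance alone, but its larger complexity penalty means the three-cluster structure has the lowest DIC.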