Time Series Analysis and Correlation of Subway Turnstile Usage and COVID-19 Prevalence in New York City


Abstract in English

In this paper, we show a strong correlation between New York City subway turnstile usage data, provided by the Metropolitan Transportation Authority (MTA), and the COVID-19 cases and deaths reported by the New York City Department of Health. The turnstile usage data indicate not only the usage of the city's subway but also the level of people's activity that contributed to the high prevalence of COVID-19 experienced by city dwellers from March to May of 2020. Although this correlation appears evident, it has not previously been demonstrated. Here we demonstrate it through the application of a long short-term memory (LSTM) neural network. We show that the correlation between COVID-19 prevalence and deaths accounts for the delay that the incubation and symptomatic phases introduce before deaths are reported. Having established this correlation, we use the Auto-Regressive Integrated Moving Average (ARIMA) model to estimate the dates when the number of COVID-19 deaths and cases would approach zero, once the reported number of deaths had begun to decrease. We also estimate the dates when the first cases and deaths occurred by back-tracing the data sets, and we compare these estimates to the reported dates.
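
The idea of a delayed correlation between an activity series and a later outcome series can be illustrated with a minimal sketch. It uses a plain lagged Pearson correlation, which is a simpler stand-in and not the LSTM approach described in the abstract. The function names `pearson` and `lagged_correlation`, the synthetic series, and the 14-day delay are all assumptions made for illustration; none of them comes from the paper or its data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def lagged_correlation(activity, outcome, max_lag):
    """Correlate activity[t] with outcome[t + lag] for lag = 0..max_lag."""
    if len(activity) != len(outcome):
        raise ValueError("series must have the same length")
    n = len(activity)
    return {
        lag: pearson(activity[: n - lag], outcome[lag:])
        for lag in range(max_lag + 1)
    }


# Synthetic daily "activity" series: a single smooth bump (illustrative only).
days = 120
activity = [math.exp(-0.5 * ((t - 40) / 10.0) ** 2) for t in range(days)]

# Synthetic "outcome" series: the same bump delayed by an assumed 14 days.
assumed_delay = 14
outcome = [0.0] * assumed_delay + activity[: days - assumed_delay]

correlations = lagged_correlation(activity, outcome, max_lag=30)
best_lag = max(correlations, key=correlations.get)
print("lag with the highest correlation:", best_lag)
```

With real data, the two lists would be replaced by daily turnstile counts and daily reported deaths, and the lag with the highest correlation would give a rough indication of the delay between activity and reported outcomes.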
