A Statistical Analysis of Noisy Crowdsourced Weather Data


Abstract in English

Spatial prediction of weather-elements like temperature, precipitation, and barometric pressure are generally based on satellite imagery or data collected at ground-stations. None of these data provide information at a more granular or hyper-local resolution. On the other hand, crowdsourced weather data, which are captured by sensors installed on mobile devices and gathered by weather-related mobile apps like WeatherSignal and AccuWeather, can serve as potential data sources for analyzing environmental processes at a hyper-local resolution. However, due to the low quality of the sensors and the non-laboratory environment, the quality of the observations in crowdsourced data is compromised. This paper describes methods to improve hyper-local spatial prediction using this varying-quality noisy crowdsourced information. We introduce a reliability metric, namely Veracity Score (VS), to assess the quality of the crowdsourced observations using a coarser, but high-quality, reference data. A VS-based methodology to analyze noisy spatial data is proposed and evaluated through extensive simulations. The merits of the proposed approach are illustrated through case studies analyzing crowdsourced daily average ambient temperature readings for one day in the contiguous United States.

Download