Scalable Traffic Predictive Analysis using GPU in Big Data


Abstract in English

This paper applies parallel computing on both CPUs and GPUs to predictive analysis, using the Apache Spark Big Data platform. A traffic dataset is used to predict traffic jams in Los Angeles County. The data was collected from a popular platform in the USA that tracks road conditions using device information and reports shared by its users. In this scalable Big Data system, large-scale traffic datasets can be stored and processed using both GPUs and CPUs. The major contribution of the paper is to improve the performance of machine learning for traffic congestion prediction in a distributed parallel computing system with GPUs. We show that parallel computing can be achieved using both GPUs and CPUs on the existing Apache Spark platform. Our method is applicable to other large-scale datasets in different domains. The modeling process and the results are evaluated in terms of computing time and three metrics: AUC, precision, and recall. The results are intended to support traffic management in smart cities.
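The three evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the labels, the scores, the 0.5 decision threshold, and the helper names `precision_recall` and `auc` are hypothetical and do not come from the paper's dataset, code, or results. A label of 1 stands for "congested" and 0 for "not congested".

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example gets the
    higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]

# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} auc={auc_value:.3f}")
```

Precision and recall depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the three are often reported together.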
