
Risk Prediction on Traffic Accidents using a Compact Neural Model for Multimodal Information Fusion over Urban Big Data

Published by Wenshan Wang
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Predicting a risk map of traffic accidents is vital for accident prevention and early planning of emergency response. The challenge lies in the multimodal nature of urban big data. We propose a compact neural ensemble model to alleviate overfitting when fusing multimodal features, and develop several new features, such as a fractal measure of road complexity in satellite images, taxi flows, POIs, and road width and connectivity in OpenStreetMap. The solution outperforms both the baseline methods and solutions based on single-modality data. At the micro level, visualization reveals the visual patterns of scenes associated with high and low risk, offering lessons for future road design. At the city level, the predicted risk map is close to the ground truth and can serve as a basis for optimizing the spatial configuration of emergency-response resources and warning signs. To the best of our knowledge, this is the first work to fuse visual and spatio-temporal features in traffic accident prediction, and it advances the effort to bridge the gap between data-mining-based urban computing and computer-vision-based urban perception.
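
As a concrete illustration of one feature named above, here is a minimal sketch of the standard box-counting estimate of fractal dimension applied to a binary road mask from a satellite tile. It assumes roads have already been segmented into a boolean array; the paper's exact feature construction is not given here, so treat this as the textbook box-counting method rather than the authors' implementation.

    # Box-counting fractal dimension of a binary road mask (illustrative,
    # not the paper's exact feature pipeline).
    import numpy as np

    def box_counting_dimension(mask: np.ndarray) -> float:
        """Estimate the fractal (box-counting) dimension of a binary 2-D mask,
        where True marks road pixels."""
        # Pad to a square power-of-two side so boxes tile evenly.
        side = 1 << int(np.ceil(np.log2(max(mask.shape))))
        padded = np.zeros((side, side), dtype=bool)
        padded[: mask.shape[0], : mask.shape[1]] = mask

        sizes, counts = [], []
        s = side
        while s >= 1:
            # Count s x s boxes containing at least one road pixel.
            view = padded.reshape(side // s, s, side // s, s)
            occupied = int(view.any(axis=(1, 3)).sum())
            if occupied > 0:
                sizes.append(s)
                counts.append(occupied)
            s //= 2

        # Slope of log(count) vs log(1/size) estimates the dimension.
        slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
        return float(slope)

    # Sanity check: a straight road should give a dimension near 1.
    demo = np.zeros((64, 64), dtype=bool)
    demo[32, :] = True
    print(box_counting_dimension(demo))  # ~1.0

A denser, more tangled road network fills space more thoroughly, so its estimated dimension rises toward 2, which is what makes this a usable complexity feature.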


Read also

This study describes the experimental application of machine learning techniques to build prediction models that can assess the injury risk associated with traffic accidents. This work uses a freely available data set of traffic accident records that took place in the city of Porto Alegre/RS (Brazil) during 2013. This study also provides an analysis of which attributes of a traffic accident are most predictive of injury to the people involved.
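
A minimal sketch of the kind of pipeline such a study implies: fit a classifier on tabular accident records and rank attribute importance. The column names and synthetic data below are hypothetical placeholders, not the Porto Alegre data set's actual schema.

    # Illustrative injury-risk classifier with feature-importance ranking.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(42)
    features = ["hour", "vehicles_involved", "road_type", "weather"]  # hypothetical
    X = rng.random((500, len(features)))
    # Synthetic label: injury outcome loosely driven by one attribute.
    y = (X[:, 1] + 0.3 * rng.standard_normal(500) > 0.5).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print("accuracy:", round(clf.score(X_te, y_te), 3))

    # Rank which attributes drive the injury prediction.
    for name, imp in sorted(zip(features, clf.feature_importances_),
                            key=lambda p: -p[1]):
        print(f"{name}: {imp:.3f}")
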
Identifying persuasive speakers in an adversarial environment is a critical task. In a national election, politicians would like to have persuasive speakers campaign on their behalf. When a company faces adverse publicity, they would like to engage persuasive advocates for their position in the presence of adversaries who are critical of them. Debates represent a common platform for these forms of adversarial persuasion. This paper solves two problems: the Debate Outcome Prediction (DOP) problem predicts who wins a debate while the Intensity of Persuasion Prediction (IPP) problem predicts the change in the number of votes before and after a speaker speaks. Though DOP has been previously studied, we are the first to study IPP. Past studies on DOP fail to leverage two important aspects of multimodal data: 1) multiple modalities are often semantically aligned, and 2) different modalities may provide diverse information for prediction. Our M2P2 (Multimodal Persuasion Prediction) framework is the first to use multimodal (acoustic, visual, language) data to solve the IPP problem. To leverage the alignment of different modalities while maintaining the diversity of the cues they provide, M2P2 devises a novel adaptive fusion learning framework which fuses embeddings obtained from two modules -- an alignment module that extracts shared information between modalities and a heterogeneity module that learns the weights of different modalities with guidance from three separately trained unimodal reference models. We test M2P2 on the popular IQ2US dataset designed for DOP. We also introduce a new dataset called QPS (from Qipashuo, a popular Chinese debate TV show) for IPP. M2P2 significantly outperforms 3 recent baselines on both datasets. Our code and QPS dataset can be found at http://snap.stanford.edu/m2p2/.
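
A structural sketch (in PyTorch) of the adaptive-fusion idea described above: per-modality embeddings are combined with learned weights (standing in for the heterogeneity module) and concatenated with a shared embedding (standing in for the alignment module). Module names and dimensions are illustrative assumptions; the authors' actual implementation is at http://snap.stanford.edu/m2p2/.

    # Illustrative adaptive fusion of per-modality and shared embeddings.
    import torch
    import torch.nn as nn

    class AdaptiveFusion(nn.Module):
        def __init__(self, dim: int, n_modalities: int = 3):
            super().__init__()
            # One learnable logit per modality; softmax yields fusion weights.
            self.logits = nn.Parameter(torch.zeros(n_modalities))
            self.head = nn.Linear(2 * dim, 1)  # predicts a persuasion score

        def forward(self, unimodal: torch.Tensor, shared: torch.Tensor):
            # unimodal: (batch, n_modalities, dim); shared: (batch, dim)
            w = torch.softmax(self.logits, dim=0)             # modality weights
            weighted = (w[None, :, None] * unimodal).sum(1)   # weighted sum
            fused = torch.cat([weighted, shared], dim=-1)
            return self.head(fused)

    model = AdaptiveFusion(dim=128)
    out = model(torch.randn(4, 3, 128), torch.randn(4, 128))
    print(out.shape)  # torch.Size([4, 1])
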
With the advancement of IoT and artificial intelligence technologies, and the need for rapid application growth in fields such as security entrance control and financial business trade, facial information processing has become an important means of achieving identity authentication and information security. In this paper, we propose a multi-feature fusion algorithm based on integral histograms and a particle filtering module with real-time tracking updates. First, edge and colour features are extracted; weighting methods are used to weight the colour histogram and edge features to describe facial features, and the fusion of colour and edge features is made adaptive through fusion coefficients to improve face-tracking reliability. Then, the integral histogram is integrated into the particle filtering algorithm to simplify the computation for complex particles. Finally, the tracking window size is adjusted in real time according to the change in the average distance from the particle centre to the edges of the current and initial models, reducing drift and achieving stable tracking under significant changes in target scale. The results show that the algorithm improves video tracking accuracy, reduces the computational complexity of the particle operations, increases speed, and has good anti-interference ability and robustness.
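
A minimal NumPy sketch of the adaptive feature-fusion step described above: colour and edge similarity scores are blended with a fusion coefficient that drifts toward the currently more reliable cue. The Bhattacharyya similarity and the coefficient update rule are illustrative choices, not the paper's exact formulas.

    # Illustrative adaptive blending of colour and edge similarity scores.
    import numpy as np

    def hist_similarity(h1: np.ndarray, h2: np.ndarray) -> float:
        """Bhattacharyya coefficient between two normalized histograms."""
        return float(np.sum(np.sqrt(h1 * h2)))

    def fused_score(color_ref, color_cand, edge_ref, edge_cand, alpha: float):
        """Blend colour and edge similarities with fusion coefficient alpha."""
        s_color = hist_similarity(color_ref, color_cand)
        s_edge = hist_similarity(edge_ref, edge_cand)
        return alpha * s_color + (1.0 - alpha) * s_edge, s_color, s_edge

    def update_alpha(alpha, s_color, s_edge, rate=0.1):
        """Nudge alpha toward whichever cue currently matches better."""
        target = s_color / (s_color + s_edge + 1e-9)
        return (1.0 - rate) * alpha + rate * target

    # Toy usage with random normalized 16-bin histograms.
    rng = np.random.default_rng(0)
    def norm_hist():
        h = rng.random(16)
        return h / h.sum()

    alpha = 0.5
    score, sc, se = fused_score(norm_hist(), norm_hist(),
                                norm_hist(), norm_hist(), alpha)
    alpha = update_alpha(alpha, sc, se)
    print(round(score, 3), round(alpha, 3))
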
The tactical driver behavior modeling problem requires understanding driver actions in complicated urban scenarios from rich multimodal signals, including video, LiDAR, and CAN bus data streams. However, the majority of deep learning research focuses either on learning the vehicle/environment state (sensor fusion) or the driver policy (from temporal data), but not both. Learning both tasks end-to-end offers the richest distillation of knowledge, but presents challenges in formulation and successful training. In this work, we propose promising first steps in this direction. Inspired by the gating mechanisms in LSTMs, we propose gated recurrent fusion units (GRFU) that learn fusion weighting and temporal weighting simultaneously. We demonstrate their superior performance over multimodal and temporal baselines in supervised regression and classification tasks, all in the realm of autonomous navigation. We note a 10% improvement in mAP score over the state of the art for tactical driver behavior classification on the HDD dataset and a 20% drop in overall mean squared error for steering action regression on the TORCS dataset.
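
A simplified PyTorch sketch in the spirit of the GRFU described above: sigmoid gates computed jointly from all modalities decide how much each modality contributes before a recurrent temporal update. The actual GRFU equations in the paper differ in detail; this only illustrates combining fusion gating with recurrence.

    # Illustrative gated fusion followed by a recurrent temporal update.
    import torch
    import torch.nn as nn

    class GatedFusionCell(nn.Module):
        def __init__(self, dim: int, n_modalities: int):
            super().__init__()
            # One gate per modality, conditioned on all modalities jointly.
            self.gate = nn.Linear(n_modalities * dim, n_modalities)
            self.rnn = nn.GRUCell(dim, dim)

        def forward(self, feats: torch.Tensor, h: torch.Tensor):
            # feats: (batch, n_modalities, dim); h: (batch, dim)
            flat = feats.flatten(1)                        # (batch, n*dim)
            g = torch.sigmoid(self.gate(flat))             # fusion gates
            fused = (g.unsqueeze(-1) * feats).sum(dim=1)   # gated sum
            return self.rnn(fused, h)                      # temporal update

    cell = GatedFusionCell(dim=64, n_modalities=3)
    h = torch.zeros(2, 64)
    for t in range(5):  # e.g. 5 steps of video/LiDAR/CAN features
        h = cell(torch.randn(2, 3, 64), h)
    print(h.shape)  # torch.Size([2, 64])
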
We propose the fusion discriminator, a single unified framework for incorporating conditional information into a generative adversarial network (GAN) for a variety of distinct structured prediction tasks, including image synthesis, semantic segmentation, and depth estimation. Much like commonly used convolutional neural network -- conditional Markov random field (CNN-CRF) models, the proposed method is able to enforce higher-order consistency in the model, but without being limited to a very specific class of potentials. The method is conceptually simple and flexible, and our experimental results demonstrate improvement on several diverse structured prediction tasks.
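
A schematic PyTorch sketch of a fusion discriminator: the conditioning image and the (real or generated) output pass through separate convolutional branches whose feature maps are fused, here by elementwise summation, before the real/fake decision. Layer sizes, 3-channel inputs, and the choice of summation are assumptions for illustration, not the paper's exact architecture.

    # Illustrative discriminator fusing conditioning and prediction features.
    import torch
    import torch.nn as nn

    def branch():
        return nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )

    class FusionDiscriminator(nn.Module):
        def __init__(self):
            super().__init__()
            self.cond_branch = branch()   # processes the conditioning input
            self.pred_branch = branch()   # processes the real/generated output
            self.head = nn.Sequential(
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(128, 1, 4),     # patch-level real/fake logits
            )

        def forward(self, cond: torch.Tensor, pred: torch.Tensor):
            # Fuse the two feature maps by elementwise sum, then classify.
            return self.head(self.cond_branch(cond) + self.pred_branch(pred))

    d = FusionDiscriminator()
    logits = d(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
    print(logits.shape)  # torch.Size([1, 1, 5, 5])

Fusing mid-level feature maps, rather than concatenating raw inputs, is what lets the discriminator judge the joint consistency of condition and output.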

