ﻻ يوجد ملخص باللغة العربية
A new convolutional neural network (CNN) architecture for 2D driver/passenger pose estimation and seat belt detection is proposed in this paper. The new architecture is more nimble and thus more suitable for in-vehicle monitoring tasks compared to other generic pose estimation algorithms. The new architecture, named NADS-Net, utilizes the feature pyramid network (FPN) backbone with multiple detection heads to achieve the optimal performance for driver/passenger state detection tasks. The new architecture is validated on a new data set containing video clips of 100 drivers in 50 driving sessions that are collected for this study. The detection performance is analyzed under different demographic, appearance, and illumination conditions. The results presented in this paper may provide meaningful insights for the autonomous driving research community and automotive industry for future algorithm development and data collection.
To help prevent motor vehicle accidents, there has been significant interest in finding an automated method to recognize signs of driver distraction, such as talking to passengers, fixing hair and makeup, eating and drinking, and using a mobile phone
Deep neural network (DNN) accelerators with improved energy and delay are desirable for meeting the requirements of hardware targeted for IoT and edge computing systems. Convolutional neural networks (CoNNs) belong to one of the most popular types of
Geometric deep learning has attracted significant attention in recent years, in part due to the availability of exotic data types for which traditional neural network architectures are not well suited. Our goal in this paper is to generalize convolut
Recently, channel attention mechanism has demonstrated to offer great potential in improving the performance of deep convolutional neural networks (CNNs). However, most existing methods dedicate to developing more sophisticated attention modules for
We proposed a deep learning method for interpretable diabetic retinopathy (DR) detection. The visual-interpretable feature of the proposed method is achieved by adding the regression activation map (RAM) after the global averaging pooling layer of th