
A Multi-Modal States based Vehicle Descriptor and Dilated Convolutional Social Pooling for Vehicle Trajectory Prediction

Published by Huimin Zhang
Publication date: 2020
Language: English





Precise trajectory prediction of surrounding vehicles is critical for the decision-making of autonomous vehicles, and learning-based approaches are well recognized for their robustness. However, state-of-the-art learning-based methods ignore 1) the feasibility of using the vehicles' multi-modal state information for prediction and 2) the mutually exclusive relationship between the global receptive field over the traffic scene and the local position resolution when modeling vehicle interactions, both of which may affect prediction accuracy. Therefore, we propose a vehicle-descriptor based LSTM model with dilated convolutional social pooling (VD+DCS-LSTM) to address these issues. First, each vehicle's multi-modal state information is used as the model's input, and a new vehicle descriptor encoded by stacked sparse auto-encoders is proposed to reflect the deep interactive relationships between the various states, achieving optimal feature extraction and effective use of the multi-modal inputs. Second, the LSTM encoder is used to encode the historical sequences composed of the vehicle descriptor, and a novel dilated convolutional social pooling is proposed to improve the modeling of vehicles' spatial interactions. Third, the LSTM decoder is used to predict the probability distribution of future trajectories based on maneuvers. The validity of the overall model was verified on the NGSIM US-101 and I-80 datasets, and our method outperforms the latest benchmark.
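To illustrate the trade-off between receptive field and position resolution mentioned in the abstract, here is a minimal sketch; it is not the authors' implementation. It shows how dilation lets stacked stride-1 convolutions cover a wider span without downsampling. The function names, kernel sizes, and dilation rates are assumptions chosen for illustration, since the abstract does not specify them.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field of stacked stride-1 convolutions (hypothetical helper)."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        # Each layer widens the field by (kernel size - 1) * dilation.
        rf += (k - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D dilated convolution, written as cross-correlation."""
    span = (len(kernel) - 1) * dilation
    out = []
    for i in range(len(signal) - span):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[i + j * dilation] for j, w in enumerate(kernel)))
    return out


# Three 3-tap layers: plain convolutions versus assumed dilation rates 1, 2, 4.
plain_rf = receptive_field([3, 3, 3], [1, 1, 1])
dilated_rf = receptive_field([3, 3, 3], [1, 2, 4])

# A 2-tap kernel with dilation 2 combines samples two positions apart.
example = dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2)
```

With the same number of layers and weights, the dilated stack covers a wider span of the grid while keeping stride 1, so the output stays at the input's position resolution.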




Read also

With the increasing deployment of diverse positioning devices and location-based services, a huge amount of spatial and temporal information has been collected and accumulated as trajectory data. Trajectory-based location prediction is gaining increasing attention because of its potential to improve the performance of many applications in multiple domains. This research focuses on trajectory sequence prediction methods using trajectory data obtained from vehicles in an urban traffic network. Building on the previously proposed Recurrent Neural Network (RNN) model, we propose an improved Attention-based Recurrent Neural Network (ARNN) model for urban vehicle trajectory prediction. We introduce an attention mechanism into urban vehicle trajectory prediction to account for the impact of network-level traffic state information. The model is evaluated on Bluetooth data from private vehicles collected in Brisbane, Australia, using 5 metrics that are widely used in sequence modeling. The proposed ARNN model shows a significant performance improvement over existing RNN models, considering not only the cells to be visited but also the alignment of the cells in the sequence.
This paper presents a novel vehicle motion forecasting method based on multi-head attention. It produces joint forecasts for all vehicles in a road scene as sequences of multi-modal probability density functions of their positions. Its architecture uses multi-head attention to account for complete interactions between all vehicles, and long short-term memory layers for encoding and forecasting. It relies solely on vehicle position tracks, does not need maneuver definitions, and does not represent the scene with a spatial grid. This allows it to be more versatile than similar models while combining many forecasting capabilities, namely joint forecasting with interactions, uncertainty estimation, and multi-modality. The resulting prediction likelihood outperforms state-of-the-art models on the same dataset.
In this paper, we present a three-dimensional (3D) non-wide-sense stationary (non-WSS) wideband geometry-based channel model for vehicle-to-vehicle (V2V) communication environments. We introduce a two-cylinder model to describe moving vehicles as well as multiple confocal semi-ellipsoid models to depict stationary roadside scenarios. The received signal is constructed as a sum of the line-of-sight (LoS), single-, and double-bounced rays with different energies. Accordingly, the proposed channel model is sufficient for depicting a wide variety of V2V environments, such as macro-, micro-, and picocells. The relative movement between the mobile transmitter (MT) and mobile receiver (MR) results in time-variant geometric statistics that make our channel model non-stationary. Using this channel model, the proposed channel statistics, i.e., the time-variant space correlation functions (CFs), frequency CFs, and corresponding Doppler power spectral density (PSD), were studied for different relative moving time instants. The numerical results demonstrate that the proposed 3D non-WSS wideband channel model is practical for characterizing real V2V channels.
This paper considers the problem of multi-modal future trajectory forecasting with ranking. Here, multi-modality and ranking refer to the multiple plausible path predictions and the confidence in those predictions, respectively. We propose Social-STAGE, a Social interaction-aware Spatio-Temporal multi-Attention Graph convolution network with novel Evaluation for multi-modality. Our main contributions include the analysis and formulation of multi-modality with ranking using interaction and multi-attention, and the introduction of new metrics to evaluate the diversity and associated confidence of multi-modal predictions. We evaluate our approach on the existing public datasets ETH and UCY and show that the proposed algorithm outperforms the state of the art on these datasets.
Fast, collision-free motion through unknown environments remains a challenging problem for robotic systems. In these situations, the robot's ability to reason about its future motion is often severely limited by the sensor field of view (FOV). By contrast, biological systems routinely make decisions by taking into consideration what might exist beyond their FOV based on prior experience. In this paper, we present an approach for predicting occupancy map representations of sensor data for future robot motions using deep neural networks. We evaluate several deep network architectures, including purely generative and adversarial models. Testing in both simulated and real environments, we demonstrated performance both qualitatively and quantitatively, with an SSIM similarity measure of up to 0.899. We showed that it is possible to make predictions about occupied space beyond the physical robot's FOV from simulated training data. In the future, this method will allow robots to navigate through unknown environments in a faster, safer manner.
