Our field has recently witnessed an arms race of neural network-based trajectory predictors. While these predictors are at the core of many applications such as autonomous navigation or pedestrian flow simulations, their adversarial robustness has not been carefully studied. In this paper, we introduce a socially-attended attack to assess the social understanding of prediction models in terms of collision avoidance. An attack is a small yet carefully crafted perturbation designed to make predictors fail. Technically, we define collision as a failure mode of the output, and propose hard- and soft-attention mechanisms to guide our attack. Thanks to our attack, we shed light on the limitations of current models in terms of their social understanding. We demonstrate the strengths of our method on recent trajectory prediction models. Finally, we show that our attack can be employed to increase the social understanding of state-of-the-art models. The code is available online: https://s-attack.github.io/
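To make the failure mode concrete, the following is a minimal sketch of how a collision between two predicted trajectories and the size of an input perturbation might be measured. It is not the attack itself: the function names `has_collision` and `perturbation_size`, the 0.2 m radius, and the toy trajectories are assumptions introduced for illustration.

```python
import numpy as np

def has_collision(pred_a, pred_b, radius=0.2):
    """Return True if two predicted trajectories come closer than `radius`
    at any shared timestep. Both inputs have shape (T, 2).
    The radius value is an illustrative assumption."""
    dists = np.linalg.norm(pred_a - pred_b, axis=1)
    return bool(np.any(dists < radius))

def perturbation_size(observed, perturbed):
    """Largest per-timestep displacement between the original and the
    perturbed observation sequence (both of shape (T, 2))."""
    return float(np.max(np.linalg.norm(perturbed - observed, axis=1)))

# Toy example: agent B passes within 0.1 m of agent A at the middle timestep.
agent_a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
agent_b = np.array([[2.0, 1.0], [1.0, 0.1], [0.0, 1.0]])
print(has_collision(agent_a, agent_b))
```

An attack in this setting would search for a perturbation with a small `perturbation_size` that makes `has_collision` true for the model's output.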
Trajectory prediction is a critical technique in the navigation of robots and autonomous vehicles. However, complex traffic and dynamic uncertainties make effective and robust modeling challenging. We propose a data-driven approach, socially aware Kalman neural networks (SAKNN), in which an interaction layer and a Kalman layer are embedded in the architecture, resulting in a class of architectures with the potential to learn directly from high-variance sensor input and robustly generate low-variance outcomes. The evaluation of our approach on the NGSIM dataset demonstrates that SAKNN achieves state-of-the-art prediction effectiveness over a relatively long-term horizon and significantly improves the signal-to-noise ratio of the predicted signal.
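The Kalman layer is described only at a high level here. As a rough illustration of why Kalman filtering turns high-variance input into a lower-variance estimate, the sketch below runs a classical constant-velocity Kalman filter on noisy 1D positions. It is not the SAKNN architecture; the function name `kalman_filter_cv`, the noise parameters, and the synthetic data are all assumptions.

```python
import numpy as np

def kalman_filter_cv(measurements, dt=1.0, meas_var=1.0, accel_var=1e-4):
    """Filter noisy 1D positions with a constant-velocity model.
    State is [position, velocity]; only position is observed."""
    F = np.array([[1.0, dt], [0.0, 1.0]])            # state transition
    H = np.array([[1.0, 0.0]])                       # observation model
    Q = accel_var * np.array([[dt**4 / 4, dt**3 / 2],
                              [dt**3 / 2, dt**2]])   # process noise
    R = np.array([[meas_var]])                       # measurement noise
    x = np.zeros(2)                                  # initial state guess
    P = np.eye(2) * 1e3                              # large initial uncertainty
    estimates = []
    for z in measurements:
        # Predict step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step
        innovation = z - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ innovation
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x[0])
    return np.array(estimates)

def rmse(a, b):
    """Root-mean-square error between two equal-length arrays."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Synthetic data: an object moving at constant speed, observed with noise.
rng = np.random.default_rng(0)
truth = 0.5 * np.arange(100)
noisy = truth + rng.normal(0.0, 1.0, size=truth.shape)
filtered = kalman_filter_cv(noisy)
print(rmse(noisy, truth), rmse(filtered, truth))
```

On data that matches the motion model, the filtered error should be below the raw measurement error, which is the kind of signal-to-noise improvement the abstract refers to.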
Smooth and seamless robot navigation while interacting with humans depends on predicting human movements. Forecasting such human dynamics often involves modeling human trajectories (global motion) or detailed body joint movements (local motion). Prior work typically tackled local and global human movements separately. In this paper, we propose a novel framework to tackle both tasks of human motion (or trajectory) and body skeleton pose forecasting in a unified end-to-end pipeline. To deal with this real-world problem, we incorporate both scene and social contexts, which are critical cues for this prediction task, into our proposed framework. To this end, we first couple these two tasks by i) encoding their history using a shared Gated Recurrent Unit (GRU) encoder and ii) applying a metric as loss, which measures the source of errors in each task jointly as a single distance. Then, we incorporate the scene context by encoding a spatio-temporal representation of the video data. We also include social cues by generating a joint feature representation from the motion and pose of all individuals in the scene using a social pooling layer. Finally, we use a GRU-based decoder to forecast both motion and skeleton pose. We demonstrate that our proposed framework achieves superior performance compared to several baselines on two social datasets.
Self-supervised learning (SSL), which can automatically generate ground-truth samples from raw data, holds vast potential to improve recommender systems. Most existing SSL-based methods perturb the raw data graph with uniform node/edge dropout to generate new data views and then conduct self-discrimination-based contrastive learning over different views to learn generalizable representations. Under this scheme, only a bijective mapping is built between nodes in two different views, which means that the self-supervision signals from other nodes are neglected. Due to the widely observed homophily in recommender systems, we argue that the supervisory signals from other nodes are also highly likely to benefit representation learning for recommendation. To capture these signals, a general socially-aware SSL framework that integrates tri-training is proposed in this paper. Technically, our framework first augments the user data views with the user social information. Then, under the regime of tri-training for multi-view encoding, the framework builds three graph encoders (one for recommendation) upon the augmented views and iteratively improves each encoder with self-supervision signals from other users, generated by the other two encoders. Since the tri-training operates on the augmented views of the same data sources for self-supervision signals, we name it self-supervised tri-training. Extensive experiments on multiple real-world datasets consistently validate the effectiveness of the self-supervised tri-training framework for improving recommendation. The code is released at https://github.com/Coder-Yu/QRec.
Because academic collaboration is important at smart conferences, various researchers have used recommender systems to generate effective recommendations for participants. Recent research has shown that the personality traits of users can be used as innovative entities for effective recommendations. Nevertheless, studies of subjective perceptions of participant personality at smart conferences are rare and have not gained much attention. Inspired by the personality and social characteristics of users, we present an algorithm called Socially and Personality Aware Recommendation of Participants (SPARP). Our recommendation methodology hybridizes the computations of similar interpersonal relationships and personality traits among participants. SPARP models the personality and social characteristic profiles of participants at a smart conference. By combining the above recommendation entities, SPARP then recommends participants to each other for effective collaborations. We evaluate SPARP using a relevant dataset. Experimental results confirm that SPARP is reliable and outperforms other state-of-the-art methods.
This research addresses recommending presentation sessions at smart conferences to participants. We propose a venue recommendation algorithm, Socially-Aware Recommendation of Venues and Environments (SARVE). SARVE computes correlation and social characteristic information of conference participants. In order to model a recommendation process using distributed community detection, SARVE further integrates the current context of both the smart conference community and participants. SARVE recommends presentation sessions that may be of high interest to each participant. We evaluate SARVE using a real-world dataset. In our experiments, we compare SARVE to two related state-of-the-art methods, namely Context-Aware Mobile Recommendation Services (CAMRS) and Conference Navigator (Recommender) Model. Our experimental results show that, in terms of the evaluation metrics used (precision, recall, and F-measure), SARVE achieves more reliable and favorable social (relations and context) recommendation results.
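For reference, precision, recall, and F-measure for a single participant's recommendation list can be computed as in the sketch below. The function name `precision_recall_f1` and the session identifiers are hypothetical, and the abstract does not state which F-measure variant was used; the balanced F1 score is assumed here.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall, and balanced F-measure (F1) for one
    recommendation list against the set of truly relevant items."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: 4 sessions recommended, 3 actually of interest,
# 2 of which appear in the recommendation list.
p, r, f = precision_recall_f1(["s1", "s2", "s3", "s4"], ["s2", "s4", "s7"])
print(p, r, f)
```

In this example precision is 2/4, recall is 2/3, and F1 is 4/7.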