Exploring Data Splitting Strategies for the Evaluation of Recommendation Models

Added by Zaiqiao Meng

Publication date 2020

Fields Informatics Engineering

Language English

Authors Zaiqiao Meng - Richard McCreadie - Craig Macdonald

Information Retrieval

Effective methodologies for evaluating recommender systems are critical, so that such systems can be compared in a sound manner. A commonly overlooked aspect of recommender system evaluation is the selection of the data splitting strategy. In this paper, we show both that there is no standard splitting strategy and that the choice of splitting strategy can have a strong impact on the ranking of recommender systems. In particular, we perform experiments comparing three common splitting strategies, examining their impact on seven state-of-the-art recommendation models across two datasets. Our results demonstrate that the splitting strategy employed is an important confounding variable that can markedly alter the ranking of state-of-the-art systems, making much of the currently published literature non-comparable, even when the same dataset and metrics are used.
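To make the notion of a splitting strategy concrete, the following is a minimal sketch in Python. The abstract does not name the three strategies it compares, so the three shown here (random, leave-one-last, and global temporal) are assumptions chosen for illustration, and the interaction data and function names are hypothetical.

```python
import random
from collections import defaultdict

# Each interaction is a (user, item, timestamp) tuple; the data is made up.
interactions = [
    ("u1", "a", 1), ("u1", "b", 4), ("u1", "c", 9),
    ("u2", "a", 2), ("u2", "d", 3), ("u2", "e", 8),
    ("u3", "b", 5), ("u3", "c", 6), ("u3", "f", 7), ("u3", "g", 10),
]

def random_split(data, test_ratio=0.2, seed=0):
    """Shuffle all interactions and hold out a fixed fraction, ignoring time."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def leave_one_last_split(data):
    """Hold out each user's most recent interaction."""
    by_user = defaultdict(list)
    for row in data:
        by_user[row[0]].append(row)
    train, test = [], []
    for rows in by_user.values():
        rows.sort(key=lambda r: r[2])
        train.extend(rows[:-1])
        test.append(rows[-1])
    return train, test

def global_temporal_split(data, test_ratio=0.2):
    """Hold out the most recent fraction of interactions across all users."""
    ordered = sorted(data, key=lambda r: r[2])
    n_test = int(round(len(ordered) * test_ratio))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]
```

Even on this toy data the held-out sets differ: the leave-one-last split tests on one interaction per user, while the global temporal split tests only on the two latest interactions overall, so a model is scored on different evidence depending on the split.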

The Simpson's Paradox in the Offline Evaluation of Recommendation Systems

Amir H. Jadidinejad, Craig Macdonald, Iadh Ounis (2021)

Recommendation systems are often evaluated based on user interactions that were collected from an existing, already deployed recommendation system. In this situation, users only provide feedback on the exposed items, and they may not leave feedback on other items since they have not been exposed to them by the deployed system. As a result, the collected feedback dataset that is used to evaluate a new model is influenced by the deployed system, as a form of closed loop feedback. In this paper, we show that the typical offline evaluation of recommender systems suffers from the so-called Simpson's paradox. Simpson's paradox is the name given to a phenomenon observed when a significant trend appears in several different sub-populations of observational data but disappears or is even reversed when these sub-populations are combined. Our in-depth experiments based on stratified sampling reveal that a very small minority of items that are frequently exposed by the deployed system acts as a confounding factor in the offline evaluation of recommendation systems. In addition, we propose a novel evaluation methodology that takes into account the confounder, i.e., the deployed system's characteristics. Using the relative comparison of many recommendation models as in the typical offline evaluation of recommender systems, and based on the Kendall rank correlation coefficient, we show that our proposed evaluation methodology exhibits statistically significant improvements of 14% and 40% on the examined open loop datasets (Yahoo! and Coat), respectively, in reflecting the true ranking of systems with an open loop (randomised) evaluation in comparison to the standard evaluation.
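The reversal described above can be reproduced with a small numeric sketch. The counts below are purely illustrative and are not taken from the paper; the two strata (frequently and rarely exposed items) and the model names are assumptions chosen to mirror the setting in the abstract.

```python
# (hits, trials) per model and item stratum; purely illustrative counts.
results = {
    "model_A": {"frequently_exposed": (81, 87), "rarely_exposed": (192, 263)},
    "model_B": {"frequently_exposed": (234, 270), "rarely_exposed": (55, 80)},
}

def stratum_rate(model, stratum):
    """Hit rate of a model within one item stratum."""
    hits, trials = results[model][stratum]
    return hits / trials

def pooled_rate(model):
    """Hit rate of a model when both strata are combined."""
    hits = sum(h for h, _ in results[model].values())
    trials = sum(t for _, t in results[model].values())
    return hits / trials

for stratum in ("frequently_exposed", "rarely_exposed"):
    print(stratum, stratum_rate("model_A", stratum), stratum_rate("model_B", stratum))
print("pooled", pooled_rate("model_A"), pooled_rate("model_B"))
```

In this sketch model_A has the higher hit rate within each stratum, yet model_B has the higher pooled rate, because model_B's trials are concentrated in the stratum where hits are easier to obtain.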

Information Retrieval

Hybrid Collaborative Filtering Models for Clinical Search Recommendation

Zhiyun Ren, Bo Peng, Titus K. Schleyer (2020)

With the increasing and extensive use of electronic health records, clinicians are often under time pressure when they need to retrieve important information efficiently from large amounts of patients' health records in clinics. While a search function can be a useful alternative to browsing through a patient's record, it is cumbersome for clinicians to search repeatedly for the same or similar information on similar patients. Under such circumstances, there is a critical need to build effective recommender systems that can generate accurate search term recommendations for clinicians. In this manuscript, we develop a hybrid collaborative filtering model that uses patients' encounter and search term information to recommend the next search terms, so that clinicians can retrieve important information quickly in clinics. For each patient, the model recommends terms that either have high co-occurrence frequencies with the patient's most recent ICD codes or are highly relevant to the most recent search terms on this patient. We have conducted comprehensive experiments to evaluate the proposed model, and the experimental results demonstrate that our model can outperform all the state-of-the-art baseline methods for top-N search term recommendation on different datasets.
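The co-occurrence half of the idea above can be sketched as a simple counting model. This is only an illustration under stated assumptions, not the authors' hybrid model: the encounter log, the search terms, and the function name `recommend_terms` are all hypothetical, and the relevance to recent search terms is not modelled here.

```python
from collections import Counter, defaultdict

# Hypothetical encounter log: (ICD codes on the encounter, search terms issued).
encounters = [
    ({"E11.9"}, ["hba1c", "glucose"]),
    ({"E11.9", "I10"}, ["hba1c", "blood pressure"]),
    ({"I10"}, ["blood pressure", "creatinine"]),
    ({"E11.9"}, ["hba1c", "creatinine"]),
]

# Count how often each search term was issued alongside each ICD code.
cooccurrence = defaultdict(Counter)
for codes, terms in encounters:
    for code in codes:
        cooccurrence[code].update(terms)

def recommend_terms(recent_codes, n=2):
    """Return the n terms that co-occur most often with the given ICD codes."""
    totals = Counter()
    for code in recent_codes:
        totals.update(cooccurrence[code])
    return [term for term, _ in totals.most_common(n)]

print(recommend_terms({"E11.9"}, n=1))
print(recommend_terms({"I10"}, n=1))
```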

Information Retrieval Machine Learning

Comprehensive Empirical Evaluation of Deep Learning Approaches for Session-based Recommendation in E-Commerce

Mohamed Maher (2020)

Boosting sales of e-commerce services is guaranteed once users find more items matching their interests in a short time. Consequently, recommendation systems have become a crucial part of any successful e-commerce service. Although various recommendation techniques could be used in e-commerce, a considerable amount of attention has been drawn to session-based recommendation systems in recent years. This growing interest is due to the security concerns in collecting personalized user behavior data, especially after the recent general data protection regulations. In this work, we present a comprehensive evaluation of the state-of-the-art deep learning approaches used in session-based recommendation. In session-based recommendation, a recommendation system relies on the sequence of events made by a user within the same session to predict and endorse other items that are more likely to correlate with the user's preferences. Our extensive experiments investigate baseline techniques (e.g., nearest neighbors and pattern mining algorithms) and deep learning approaches (e.g., recurrent neural networks, graph neural networks, and attention-based networks). Our evaluations show that advanced neural-based models and session-based nearest neighbor algorithms outperform the baseline techniques in most of the scenarios. However, we found that these models suffer more in the case of long sessions, when there is drift in user interests, and when there is not enough data to model different items correctly during training. Our study suggests that using hybrid models of different approaches combined with baseline algorithms could lead to substantial results in session-based recommendation, depending on dataset characteristics. We also discuss the drawbacks of current session-based recommendation algorithms and further open research directions in this field.

Information Retrieval Computers and Society Multimedia

Context-Aware Attention-Based Data Augmentation for POI Recommendation

Yang Li, Yadan Luo, Zheng Zhang (2021)

With the rapid growth of location-based social networks (LBSNs), Point-Of-Interest (POI) recommendation has been broadly studied in this decade. Recently, next POI recommendation, a natural extension of POI recommendation, has attracted much attention. It aims at suggesting the next POI to a user in a spatial and temporal context, which is a practical yet challenging task in various applications. Existing approaches mainly model the spatial and temporal information, and memorize historical patterns through users' trajectories for recommendation. However, they suffer from the negative impact of missing and irregular check-in data, which significantly influences model performance. In this paper, we propose an attention-based sequence-to-sequence generative model, namely POI-Augmentation Seq2Seq (PA-Seq2Seq), to address the sparsity of the training set by making check-in records evenly spaced. Specifically, the encoder summarises each check-in sequence and the decoder predicts the possible missing check-ins based on the encoded information. In order to learn time-aware correlations among user history, we employ a local attention mechanism to help the decoder focus on a specific range of context information when predicting a certain missing check-in point. Extensive experiments have been conducted on two real-world check-in datasets, Gowalla and Brightkite, for performance and effectiveness evaluation.

Information Retrieval

Overcoming Data Sparsity in Group Recommendation

Hongzhi Yin, Qinyong Wang, Kai Zheng (2020)

Suggesting satisfying activities to a group of users has been an important task for recommender systems in people's daily social life. The major challenge in this task is how to aggregate the personal preferences of group members to infer the decision of a group. Conventional group recommendation methods apply a predefined strategy for preference aggregation. However, these static strategies are too simple to model the real and complex process of group decision-making, especially for occasional groups that are formed ad hoc. Moreover, group members should have non-uniform influences or weights in a group, and the weight of a user can vary across groups. Therefore, an ideal group recommender system should be able to accurately learn not only users' personal preferences but also the preference aggregation strategy from data. In this paper, we propose a novel end-to-end group recommender system named CAGR (short for Centrality Aware Group Recommender), which takes the Bipartite Graph Embedding Model (BGEM), the self-attention mechanism and Graph Convolutional Networks (GCNs) as basic building blocks to learn group and user representations in a unified way. Specifically, we first extend BGEM to model group-item interactions, and then, in order to overcome the limitation and sparsity of the interaction data generated by occasional groups, we propose a self-attentive mechanism to represent groups based on the group members. In addition, to overcome the sparsity issue of user-item interaction data, we leverage the user social networks to enhance user representation learning, obtaining centrality-aware user representations. We create three large-scale benchmark datasets and conduct extensive experiments on them. The experimental results show the superiority of our proposed CAGR by comparing it with state-of-the-art group recommender models.
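The contrast between predefined aggregation strategies and non-uniform member weights can be sketched as follows. This is not CAGR: the abstract does not name the static strategies, so averaging and least misery are assumed here as two commonly used examples, the softmax-weighted variant only stands in for the idea of learned member weights, and all scores, names, and logits are hypothetical.

```python
import math

# Hypothetical predicted scores: member -> item -> preference in [0, 1].
scores = {
    "alice": {"cinema": 0.9, "hiking": 0.2},
    "bob": {"cinema": 0.8, "hiking": 0.3},
    "carol": {"cinema": 0.1, "hiking": 0.6},
}

def average(item):
    """Static strategy: every member counts equally."""
    return sum(s[item] for s in scores.values()) / len(scores)

def least_misery(item):
    """Static strategy: the group is only as happy as its least happy member."""
    return min(s[item] for s in scores.values())

def weighted(item, logits):
    """Non-uniform strategy: softmax-normalised member weights.

    In a learned model the logits would come from data; here they are
    supplied by hand for illustration.
    """
    z = sum(math.exp(v) for v in logits.values())
    return sum(math.exp(logits[m]) / z * scores[m][item] for m in scores)

for item in ("cinema", "hiking"):
    print(item, average(item), least_misery(item),
          weighted(item, {"alice": 0.0, "bob": 0.0, "carol": 3.0}))
```

On this toy data, averaging prefers the cinema while least misery prefers hiking, and raising one member's weight can flip the outcome again, which illustrates why a single fixed strategy may not fit every group.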

Information Retrieval Machine Learning