Do you want to publish a course? Click here

Limits of PageRank-based ranking methods in sports data

60   0   0.0 ( 0 )
 Added by Matus Medo
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

While PageRank has been extensively used to rank sport tournament participants (teams or individuals), its superiority over simpler ranking methods has been never clearly demonstrated. We use sports results from 18 major leagues to calibrate a state-of-art model for synthetic sports results. Model data are then used to assess the ranking performance of PageRank in a controlled setting. We find that PageRank outperforms the benchmark ranking by the number of wins only when a small fraction of all games have been played. Increased randomness in the data, such as intrinsic randomness of outcomes or advantage of home teams, further reduces the range of PageRanks superiority. We propose a new PageRank variant which outperforms PageRank in all evaluated settings, yet shares its sensitivity to increased randomness in the data. Our main findings are confirmed by evaluating the ranking algorithms on real data. Our work demonstrates the danger of using novel metrics and algorithms without considering their limits of applicability.



rate research

Read More

Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods though the seed set expansion problem: given a subset $S$ of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate in the space of landing probabilities of a random walk rooted at the seed set, ranking nodes according to weighted sums of landing probabilities of different length walks. Both schemes, however, lack an a priori relationship to the seed set objective. In this work we develop a principled framework for evaluating ranking methods by studying seed set expansion applied to the stochastic block model. We derive the optimal gradient for separating the landing probabilities of two classes in a stochastic block model, and find, surprisingly, that under reasonable assumptions the gradient is asymptotically equivalent to personalized PageRank for a specific choice of the PageRank parameter $alpha$ that depends on the block model parameters. This connection provides a novel formal motivation for the success of personalized PageRank in seed set expansion and node ranking generally. We use this connection to propose more advanced techniques incorporating higher moments of landing probabilities; our advanced methods exhibit greatly improved performance despite being simple linear classification rules, and are even competitive with belief propagation.
Rankings of people and items has been highly used in selection-making, match-making, and recommendation algorithms that have been deployed on ranging of platforms from employment websites to searching tools. The ranking position of a candidate affects the amount of opportunities received by the ranked candidate. It has been observed in several works that the ranking of candidates based on their score can be biased for candidates belonging to the minority community. In recent works, the fairness-aware representative ranking was proposed for computing fairness-aware re-ranking of results. The proposed algorithm achieves the desired distribution of top-ranked results with respect to one or more protected attributes. In this work, we highlight the bias in fairness-aware representative ranking for an individual as well as for a group if the group is sub-active on the platform. We define individual unfairness and group unfairness and propose methods to generate ideal individual and group fair representative ranking if the universal representation ratio is known or unknown. The simulation results show the quantified analysis of fairness in the proposed solutions. The paper is concluded with open challenges and further directions.
We propose an automated and unsupervised methodology for a novel summarization of group behavior based on content preference. We show that graph theoretical community evolution (based on similarity of user preference for content) is effective in indexing these dynamics. Combined with text analysis that targets automatically-identified representative content for each community, our method produces a novel multi-layered representation of evolving group behavior. We demonstrate this methodology in the context of political discourse on a social news site with data that spans more than four years and find coexisting political leanings over extended periods and a disruptive external event that lead to a significant reorganization of existing patterns. Finally, where there exists no ground truth, we propose a new evaluation approach by using entropy measures as evidence of coherence along the evolution path of these groups. This methodology is valuable to designers and managers of online forums in need of granular analytics of user activity, as well as to researchers in social and political sciences who wish to extend their inquiries to large-scale data available on the web.
In order to accomplish complex tasks, it is often necessary to compose a team consisting of experts with diverse competencies. However, for proper functioning, it is also preferable that a team be socially cohesive. A team recommendation system, which facilitates the search for potential team members can be of great help both for (i) individuals who need to seek out collaborators and (ii) managers who need to build a team for some specific tasks. A decision support system which readily helps summarize such metrics, and possibly rank the teams in a personalized manner according to the end users preferences, can be a great tool to navigate what would otherwise be an information avalanche. In this work we present a general framework of how to compose such subsystems together to build a composite team recommendation system, and instantiate it for a case study of academic teams.
Wikipedia is a huge global repository of human knowledge, that can be leveraged to investigate interwinements between cultures. With this aim, we apply methods of Markov chains and Google matrix, for the analysis of the hyperlink networks of 24 Wikipedia language editions, and rank all their articles by PageRank, 2DRank and CheiRank algorithms. Using automatic extraction of people names, we obtain the top 100 historical figures, for each edition and for each algorithm. We investigate their spatial, temporal, and gender distributions in dependence of their cultural origins. Our study demonstrates not only the existence of skewness with local figures, mainly recognized only in their own cultures, but also the existence of global historical figures appearing in a large number of editions. By determining the birth time and place of these persons, we perform an analysis of the evolution of such figures through 35 centuries of human history for each language, thus recovering interactions and entanglement of cultures over time. We also obtain the distributions of historical figures over world countries, highlighting geographical aspects of cross-cultural links. Considering historical figures who appear in multiple editions as interactions between cultures, we construct a network of cultures and identify the most influential cultures according to this network.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا