The Facebook News Feed personalization algorithm has a significant daily impact on the lifestyle, mood, and opinions of millions of Internet users. Nonetheless, the behavior of this algorithm lacks transparency, motivating measurements, modeling, and analysis to understand and improve its properties. In this paper, we propose a reproducible methodology encompassing measurements, an analytical model, and a fairness-based News Feed design. The model leverages the versatility and analytical tractability of time-to-live (TTL) counters to capture the visibility and occupancy of publishers over a News Feed. Measurements are used to parameterize the proposed model and to validate its expressive power. We then conduct a what-if analysis to assess the visibility and occupancy bias incurred by users against a baseline derived from the model. Our results indicate that a significant bias exists and that it is more prominent at the top position of the News Feed. In addition, we find that the bias is non-negligible even for users who are deliberately set as neutral with respect to their political views, motivating the proposal of a novel and more transparent fairness-based News Feed design.
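As a rough illustration of how TTL counters can yield visibility and occupancy, the sketch below makes two assumptions that the abstract does not state: a publisher posts as a Poisson process with rate `rate`, and every post stays in the feed for a fixed time `ttl`. Under those assumptions, standard infinite-server queue results give the expressions used here. The function names are hypothetical, and this is not necessarily the model used in the paper.

```python
import math


def ttl_visibility(rate: float, ttl: float) -> float:
    """Probability that at least one post of the publisher is in the feed.

    Assumes Poisson posting with the given rate and a fixed TTL per post
    (illustrative assumptions, not taken from the abstract).
    """
    if rate < 0 or ttl < 0:
        raise ValueError("rate and ttl must be non-negative")
    return 1.0 - math.exp(-rate * ttl)


def ttl_occupancy(rate: float, ttl: float) -> float:
    """Expected number of the publisher's posts in the feed (Little's law)."""
    if rate < 0 or ttl < 0:
        raise ValueError("rate and ttl must be non-negative")
    return rate * ttl


if __name__ == "__main__":
    # Hypothetical publisher: 2 posts per hour, each visible for 0.5 hours.
    print(ttl_visibility(2.0, 0.5), ttl_occupancy(2.0, 0.5))
```

Comparing these quantities across publishers for a measured feed against the same quantities under a chosen baseline is one way a visibility or occupancy bias could be quantified.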
Classification has made significant progress as artificial intelligence (AI) has matured. However, differentiating items among categories that lack clear boundaries remains a major challenge for machines, and meeting it is crucial for machine intelligence. To study fuzzy concepts in classification, we define globalness detection and propose a four-stage operational flow for it. We then demonstrate our framework on the inter-like graph of Facebook public pages together with their geo-locations. Our prediction algorithm achieves high precision (89%) and recall (88%) on local pages. We evaluate the results at both the state and country levels, finding that the ratio of global nodes is relatively high in states with large, international cities (NY, CA). Several examples of global nodes are also presented and studied in this paper. We hope that our results unveil the perfect value from every classification problem and provide a better understanding of global and local nodes in Online Social Networks (OSNs).
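For readers less familiar with the two reported metrics, the following minimal sketch shows how precision and recall for the "local" class would be computed from predicted and true labels. The label strings and the toy data are hypothetical and are not drawn from the paper's dataset.

```python
def precision_recall(y_true, y_pred, positive="local"):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy labels only; they do not reproduce the reported 89% / 88%.
    truth = ["local", "local", "global", "local", "global"]
    guess = ["local", "global", "global", "local", "local"]
    print(precision_recall(truth, guess))
```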
Online social networks have become incredibly popular in recent years, prompting a growing number of companies to promote their brands and products through social media. This paper presents an approach for identifying influential nodes in online social networks for brand communication. We first construct a weighted network model of the users and their relationships, extracted from brand-related content. We quantitatively measure the individual value of each node in the community from both the network-structure and brand-engagement perspectives. We then propose an algorithm for identifying the influential nodes in the virtual brand community. The algorithm evaluates the importance of a node using its individual value as well as the individual values of its surrounding nodes. To build the dataset, we extract and construct a virtual brand community for a specific brand from a real-life online social network, and we evaluate the proposed approach empirically on it. The experimental results show that the proposed approach is able to identify influential nodes in online social networks and that it yields an identification result with a higher ratio of verified users and higher user coverage.
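The idea of scoring a node by its own value together with the values of its surrounding nodes can be sketched as follows. The specific combination rule (a convex mix, controlled by a hypothetical parameter `alpha`, of a node's own value and the edge-weighted average of its neighbors' values) is an assumption made for illustration; the abstract does not give the paper's actual formula.

```python
def influence_scores(values, edges, alpha=0.5):
    """Score each node from its own value and its neighbors' values.

    values: dict mapping node -> individual value (assumed precomputed).
    edges:  iterable of (u, v, weight) for an undirected weighted graph.
    alpha:  hypothetical mixing weight given to the neighborhood term.
    """
    neighbors = {node: {} for node in values}
    for u, v, w in edges:
        neighbors[u][v] = w
        neighbors[v][u] = w

    scores = {}
    for node, own in values.items():
        total_w = sum(neighbors[node].values())
        if total_w > 0:
            # Edge-weighted average of the neighbors' individual values.
            nb = sum(w * values[n] for n, w in neighbors[node].items()) / total_w
        else:
            nb = 0.0
        scores[node] = (1 - alpha) * own + alpha * nb
    return scores


def top_k(scores, k):
    """Return the k highest-scoring nodes, ties broken by node name."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


if __name__ == "__main__":
    vals = {"a": 1.0, "b": 0.0, "c": 0.5}
    graph = [("a", "b", 1.0), ("b", "c", 1.0)]
    print(top_k(influence_scores(vals, graph), 2))
```

In this toy graph, node `b` has no individual value of its own but still receives a non-zero score because it sits between two valued nodes, which is the kind of effect a neighborhood-aware measure is meant to capture.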
Parler is an alternative social network that promotes itself as a service allowing users to "speak freely and express yourself openly, without fear of being deplatformed for your views." Because of this promise, the platform became popular among users who were suspended from mainstream social networks for violating their terms of service, as well as among those fearing censorship. In particular, the service was endorsed by several conservative public figures, who encouraged people to migrate from traditional social networks. After the storming of the US Capitol on January 6, 2021, Parler was progressively deplatformed: its app was removed from the Apple and Google Play stores, and its website was taken down by the hosting provider. This paper presents a dataset of 183M Parler posts made by 4M users between August 2018 and January 2021, as well as metadata from 13.25M user profiles. We also present a basic characterization of the dataset, which shows that the platform witnessed large influxes of new users after being endorsed by popular figures, as well as in reaction to the 2020 US Presidential Election. We also show that discussion on the platform is dominated by conservative topics, President Trump, and conspiracy theories such as QAnon.
Online social networks represent a popular and diverse class of social media systems. Despite this variety, each of these systems undergoes a general process of online social network assembly: the complicated and heterogeneous changes that transform newly born systems into mature platforms. However, little is known about this process. For example, how much of a network's assembly is driven by simple growth? How does a network's structure change as it matures? How does network structure vary with adoption rates and user heterogeneity, and do these properties play different roles at different points in the assembly? We investigate these and other questions using a unique dataset of online connections among the roughly one million users at the first 100 colleges admitted to Facebook, captured just 20 months after its launch. We first show that different vintages and adoption rates across this population of networks reveal temporal dynamics of the assembly process, and that assembly is only loosely related to network growth. We then exploit natural experiments embedded in this dataset, together with complementary data obtained via Internet archaeology, to show that different subnetworks matured at different rates toward similar end states. These results shed light on the processes and patterns of online social network assembly and may facilitate more effective design of online social systems.
Activity maximization is the task of finding a small subset of users in a given social network that maximizes the expected total activity benefit. It generalizes many real applications. In this paper, we extend the activity maximization problem to a general marketing strategy $\vec{x}$, which is a $d$-dimensional vector from a lattice space and activates a node $u$ as a seed with probability $h_u(\vec{x})$. Based on this, we propose the continuous activity maximization (CAM) problem, where the domain is continuous and the selected seed set follows a certain probability distribution. Information diffusion under a lattice constraint is a new topic, so we address the problem systematically here. First, we analyze the hardness of CAM and how to compute its objective function accurately and efficiently. We prove that this objective function is monotone, but neither DR-submodular nor DR-supermodular. Then, we develop a monotone and DR-submodular lower bound and upper bound of CAM, and apply sampling techniques to design three unbiased estimators for CAM, its lower bound, and its upper bound. Next, adapting the IMM algorithm and the sandwich approximation framework, we obtain a data-dependent approximation ratio. This process can be regarded as a general method for solving maximization problems on a lattice whose objective is not DR-submodular. Finally, we conduct experiments on three real-world datasets to evaluate the correctness and effectiveness of our proposed algorithms.
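Two ideas named above can be made concrete with a small sketch. The first function brute-force checks the usual diminishing-returns (DR) condition on a small integer lattice: for all $x \le y$ component-wise and every coordinate $i$, $f(x + e_i) - f(x) \ge f(y + e_i) - f(y)$. The second shows the final selection step of a sandwich approximation: among the solutions obtained for the lower bound, the upper bound, and the original objective, keep the one that scores best on the original objective. The function names, the integer grid, and the toy objectives are illustrative assumptions and do not reproduce the CAM objective or the paper's algorithms.

```python
from itertools import product


def is_dr_submodular(f, dims, upper, tol=1e-12):
    """Brute-force DR-submodularity check on the grid {0..upper}^dims."""
    points = list(product(range(upper + 1), repeat=dims))
    for x in points:
        for y in points:
            if any(a > b for a, b in zip(x, y)):
                continue  # the condition only applies when x <= y
            for i in range(dims):
                if y[i] + 1 > upper:
                    continue  # stay inside the grid
                x_up = x[:i] + (x[i] + 1,) + x[i + 1:]
                y_up = y[:i] + (y[i] + 1,) + y[i + 1:]
                # Marginal gain at the smaller point must not be smaller.
                if f(x_up) - f(x) < f(y_up) - f(y) - tol:
                    return False
    return True


def sandwich_select(objective, candidates):
    """Return the candidate with the largest value of the true objective."""
    return max(candidates, key=objective)


if __name__ == "__main__":
    concave = lambda x: sum(v ** 0.5 for v in x)  # diminishing returns
    bilinear = lambda x: x[0] * x[1]              # increasing returns
    print(is_dr_submodular(concave, 2, 3), is_dr_submodular(bilinear, 2, 3))
    # Hypothetical solutions from the lower bound, upper bound, and objective.
    print(sandwich_select(bilinear, [(1, 2), (2, 2), (3, 0)]))
```

The exhaustive check is exponential in the number of dimensions and is only meant for sanity-testing small examples.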