We use sequential large-scale crawl data to empirically investigate and validate the dynamics that underlie the evolution of the structure of the web. We find that the overall structure of the web is defined by an intricate interplay between experience or entitlement of the pages (as measured by the number of inbound hyperlinks a page already has), inherent talent or fitness of the pages (as measured by the likelihood that someone visiting the page would give a hyperlink to it), and the continual high rates of birth and death of pages on the web. We find that the web is conservative in judging talent and the overall fitness distribution is exponential, showing low variability. The small variance in talent, however, is enough to lead to experience distributions with high variance: The preferential attachment mechanism amplifies these small biases and leads to heavy-tailed power-law (PL) inbound degree distributions over all pages, as well as over pages that are of the same age. The balancing act between experience and talent on the web allows newly introduced pages with novel and interesting content to grow quickly and surpass older pages. In this regard, it is much like what we observe in high-mobility and meritocratic societies: People with entitlement continue to have access to the best resources, but there is just enough screening for fitness that allows for talented winners to emerge and join the ranks of the leaders. Finally, we show that the fitness estimates have potential practical applications in ranking query results.
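The interplay between experience and talent described above can be illustrated with a toy growth simulation. This is a minimal sketch under stated assumptions, not the model fitted in the study: each new page links to one existing page chosen with probability proportional to fitness times (in-degree + 1), fitness is drawn from an exponential distribution with unit rate, and page death is left out entirely. The function name, the attachment kernel, and all parameter values are hypothetical.

```python
import random


def simulate_fitness_growth(n_pages=2000, seed=0):
    """Grow a toy web where each new page links to one existing page.

    Assumed kernel (not taken from the abstract): the chance of receiving
    the link is proportional to fitness * (in_degree + 1).
    """
    rng = random.Random(seed)
    fitness = [rng.expovariate(1.0)]  # exponential fitness, unit rate (assumed)
    in_degree = [0]
    for _ in range(1, n_pages):
        # Experience (in-degree) and talent (fitness) jointly set the weight.
        weights = [f * (k + 1) for f, k in zip(fitness, in_degree)]
        target = rng.choices(range(len(fitness)), weights=weights, k=1)[0]
        in_degree[target] += 1
        # The newcomer starts with no inbound links and its own fitness.
        fitness.append(rng.expovariate(1.0))
        in_degree.append(0)
    return fitness, in_degree


fitness, in_degree = simulate_fitness_growth()
# Inspect the most-linked pages alongside their fitness values.
top = sorted(range(len(in_degree)), key=lambda i: in_degree[i], reverse=True)[:5]
for i in top:
    print(i, in_degree[i], round(fitness[i], 3))
```

Sorting the resulting in-degrees shows how a multiplicative kernel of this kind can let a late but fit page overtake older ones. Adding page deletion would be needed to approximate the high birth and death rates the abstract mentions.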
We present a method for accurately predicting the long-term popularity of online content from early measurements of user access. Using two content-sharing portals, YouTube and Digg, we show that by modeling the accrual of views and votes on content offered by these services we can predict the long-term dynamics of individual submissions from initial data. In the case of Digg, measuring access to given stories during the first two hours allows us to forecast their popularity 30 days ahead with remarkable accuracy, while downloads of YouTube videos need to be followed for 10 days to attain the same performance. The differing time scales of the predictions are shown to be due to differences in how content is consumed on the two portals: Digg stories quickly become outdated, while YouTube videos are still found long after they are initially submitted to the portal. We show that predictions are more accurate for submissions for which attention decays quickly, whereas predictions for evergreen content are prone to larger errors.
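One simple way to turn early counts into a long-term forecast is a least-squares fit on log-transformed popularity. The sketch below assumes a linear relation between the logarithm of the early count and the logarithm of the later count; the abstract does not state the functional form, so this is an illustrative assumption, and the function names and the sample numbers are hypothetical.

```python
import math


def fit_log_linear(early_counts, late_counts):
    """Fit ln(late) = intercept + slope * ln(early) by ordinary least squares."""
    xs = [math.log(c) for c in early_counts]
    ys = [math.log(c) for c in late_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_late(early_count, intercept, slope):
    """Forecast the later count for a new item from its early count."""
    return math.exp(intercept + slope * math.log(early_count))


# Hypothetical training data: counts after the early window and after 30 days.
early = [10, 20, 40, 80]
late = [100, 200, 400, 800]
intercept, slope = fit_log_linear(early, late)
print(predict_late(5, intercept, slope))
```

Working in log space keeps a few very popular items from dominating the fit. With real data, the scatter around the fitted line indicates how much error to expect for slowly decaying, evergreen content.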
Early analyses revealed that dark web marketplaces (DWMs) started offering COVID-19 related products (e.g., masks and COVID-19 tests) as soon as the current pandemic started, when these goods were in short supply in the traditional economy. Here, we broaden the scope and depth of previous investigations by analysing 194 DWMs until July 2021, including the crucial period in which vaccines became available, and by considering the wider impact of the pandemic on DWMs. First, we focus on vaccines. We find 250 listings offering approved vaccines, like Pfizer/BioNTech and AstraZeneca, as well as vendors offering fabricated proofs of vaccination and COVID-19 passports. Second, we consider COVID-19 related products. We reveal that, as the regular economy has become able to satisfy the demand for these goods, DWMs have reduced their supply. Third, we analyse the profile of vendors of COVID-19 related products and vaccines. We find that most of them specialize in a single type of listing and are willing to ship worldwide. Finally, we consider a broader set of listings simply mentioning COVID-19. Among 10,330 such listings, we show that recreational drugs are the most affected among traditional DWM products, with COVID-19 mentions steadily increasing since March 2020. We anticipate that our effort is of interest to researchers, practitioners, and law enforcement agencies focused on the study and safeguarding of public health.
The COVID-19 pandemic has reshaped the demand for goods and services worldwide. The combination of a public health emergency, economic distress, and misinformation-driven panic has pushed customers and vendors towards the shadow economy. In particular, dark web marketplaces (DWMs), commercial websites accessible via free software, have gained significant popularity. Here, we analyse 851,199 listings extracted from 30 DWMs between January 1, 2020 and November 16, 2020. We identify 788 listings directly related to COVID-19 products and monitor the temporal evolution of product categories including Personal Protective Equipment (PPE), medicines (e.g., hydroxychloroquine), and medical frauds. Finally, we compare trends in their temporal evolution with variations in public attention, as measured by Twitter posts and Wikipedia page visits. We reveal how the online shadow economy has evolved during the COVID-19 pandemic and highlight the importance of continuous monitoring of DWMs, especially now that real vaccines are available and in short supply. We anticipate our analysis will be of interest both to researchers and public agencies focused on the protection of public health.
The development of efficient business process models and the determination of their characteristic properties are the subject of intense interdisciplinary research. Here, we consider a business process model as a directed graph. Its nodes correspond to the units identified by the modeler, and the link direction indicates the causal dependencies between units. It is of primary interest to obtain the stationary flow on such a directed graph, which corresponds to the steady state of a firm during the business process. Following the ideas developed recently for the World Wide Web, we construct the Google matrix for our business process model and analyze its spectral properties. The importance of nodes is characterized by PageRank and by the recently proposed CheiRank and 2DRank. The results show that this two-dimensional ranking gives significant information about the influence and communication properties of business model units. We argue that the Google matrix method described here provides a new, efficient tool to help companies decide how to evolve in the exceedingly dynamic global market.
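The construction described above can be sketched in a few lines. The code below builds a column-stochastic Google matrix from an adjacency matrix and obtains PageRank as its stationary vector by power iteration; CheiRank is obtained the same way after reversing every link. The damping factor 0.85, the uniform treatment of nodes without outgoing links, the function names, and the four-node example graph are assumptions for illustration, not values taken from the study. 2DRank, which combines the two orderings, is not implemented here.

```python
def google_matrix(adj, alpha=0.85):
    """Return G with G[i][j] = alpha * S[i][j] + (1 - alpha) / n.

    adj[j][i] == 1 means a link from node j to node i. Column j of S holds
    the transition probabilities out of node j; a node with no outgoing
    links is assumed to jump uniformly to every node.
    """
    n = len(adj)
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        out_degree = sum(adj[j])
        for i in range(n):
            s = 1.0 / n if out_degree == 0 else adj[j][i] / out_degree
            G[i][j] = alpha * s + (1.0 - alpha) / n
    return G


def stationary_vector(G, tol=1e-12, max_iter=10000):
    """Power iteration for the eigenvector of G with eigenvalue 1."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        new_p = [sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
        if sum(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p


def transpose(adj):
    """Reverse the direction of every link."""
    n = len(adj)
    return [[adj[j][i] for j in range(n)] for i in range(n)]


def pagerank(adj, alpha=0.85):
    return stationary_vector(google_matrix(adj, alpha))


def cheirank(adj, alpha=0.85):
    return stationary_vector(google_matrix(transpose(adj), alpha))


# Hypothetical process: unit 0 feeds units 1 and 2, which both feed unit 3.
adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print("PageRank:", pagerank(adj))
print("CheiRank:", cheirank(adj))
```

In this small example the final unit, which only receives links, has the largest PageRank, while the initial unit, which only emits links, has the largest CheiRank. This matches the reading of PageRank as a measure of incoming influence and CheiRank as a measure of outgoing communication.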
The subject of collective attention is central to an information age where millions of people are inundated with daily messages. It is thus of interest to understand how attention to novel items propagates and eventually fades among large populations. We have analyzed the dynamics of collective attention among one million users of an interactive website, digg.com, devoted to thousands of novel news stories. The observations can be described by a dynamical model characterized by a single novelty factor. Our measurements indicate that novelty within groups decays with a stretched-exponential law, suggesting the existence of a natural time scale over which attention fades.
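For readers unfamiliar with the term, a stretched exponential has the generic form exp(-(t / tau)^beta) with 0 < beta < 1, where tau sets the time scale of the decay. The sketch below evaluates this form and recovers beta from two samples using the identity ln(-ln f) = beta * ln t - beta * ln tau. The function names and the parameter values are hypothetical and are not the values measured in the study.

```python
import math


def stretched_exponential(t, tau, beta):
    """Generic stretched-exponential decay; beta = 1 gives a plain exponential."""
    return math.exp(-((t / tau) ** beta))


def estimate_beta(t1, f1, t2, f2):
    """Estimate beta from two samples (t1, f1) and (t2, f2), with 0 < f < 1.

    On a plot of ln(-ln f) against ln t, a stretched exponential is a
    straight line whose slope is beta.
    """
    y1 = math.log(-math.log(f1))
    y2 = math.log(-math.log(f2))
    return (y2 - y1) / (math.log(t2) - math.log(t1))


# Hypothetical parameters chosen only to exercise the functions.
tau, beta = 2.0, 0.4
f1 = stretched_exponential(1.0, tau, beta)
f2 = stretched_exponential(8.0, tau, beta)
print(estimate_beta(1.0, f1, 8.0, f2))
```

The same linearization offers a quick visual check on measured data: if the points of ln(-ln f) against ln t fall on a line with slope below one, a stretched exponential is a plausible description.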