No Arabic abstract
The World Economic Forum (WEF) publishes annual reports on global risks which have the high impact on the worlds economy. Currently, many researchers analyze the modeling and evolution of risks. However, few studies focus on validation of the global risk networks published by the WEF. In this paper, we first create a risk knowledge graph from the annotated risk events crawled from the Wikipedia. Then, we compare the relational dependencies of risks in the WEF and Wikipedia networks, and find that they share over 50% of their edges. Moreover, the edges unique to each network signify the different perspectives of the experts and the public on global risks. To reduce the cost of manual annotation of events triggering risk activation, we build an auto-detection tool which filters out over 80% media reported events unrelated to the global risks. In the process of filtering, our tool also continuously learns keywords relevant to global risks from the event sentences. Using locations of events extracted from the risk knowledge graph, we find characteristics of geographical distributions of the categories of global risks.
Can public social media data be harnessed to predict COVID-19 case counts? We analyzed approximately 15 million COVID-19 related posts on Weibo, a popular Twitter-like social media platform in China, from November 1, 2019 to March 31, 2020. We developed a machine learning classifier to identify sick posts, which are reports of ones own and other peoples symptoms and diagnosis related to COVID-19. We then modeled the predictive power of sick posts and other COVID-19 posts on daily case counts. We found that reports of symptoms and diagnosis of COVID-19 significantly predicted daily case counts, up to 14 days ahead of official statistics. But other COVID-19 posts did not have similar predictive power. For a subset of geotagged posts (3.10% of all retrieved posts), we found that the predictive pattern held true for both Hubei province and the rest of mainland China, regardless of unequal distribution of healthcare resources and outbreak timeline. Researchers and disease control agencies should pay close attention to the social media infosphere regarding COVID-19. On top of monitoring overall search and posting activities, it is crucial to sift through the contents and efficiently identify true signals from noise.
Networks such as social networks, airplane networks, and citation networks are ubiquitous. The adjacency matrix is often adopted to represent a network, which is usually high dimensional and sparse. However, to apply advanced machine learning algorithms to network data, low-dimensional and continuous representations are desired. To achieve this goal, many network embedding methods have been proposed recently. The majority of existing methods facilitate the local information i.e. local connections between nodes, to learn the representations, while completely neglecting global information (or node status), which has been proven to boost numerous network mining tasks such as link prediction and social recommendation. Hence, it also has potential to advance network embedding. In this paper, we study the problem of preserving local and global information for network embedding. In particular, we introduce an approach to capture global information and propose a network embedding framework LOG, which can coherently model {bf LO}cal and {bf G}lobal information. Experimental results demonstrate the ability to preserve global information of the proposed framework. Further experiments are conducted to demonstrate the effectiveness of learned representations of the proposed framework.
Risk and response communication of public agencies through social media played a significant role in the emergence and spread of novel Coronavirus (COVID-19) and such interactions were echoed in other information outlets. This study collected time-sensitive online social media data and analyzed such communication patterns from public health (WHO, CDC), emergency (FEMA), and transportation (FDOT) agencies using data-driven methods. The scope of the work includes a detailed understanding of how agencies communicate risk information through social media during a pandemic and influence community response (i.e. timing of lockdown, timing of reopening) and disease outbreak indicators (i.e. number of confirmed cases, number of deaths). The data includes Twitter interactions from different agencies (2.15K tweets per agency on average) and crowdsourced data (i.e. Worldometer) on COVID-19 cases and deaths were observed between February 21, 2020 and June 06, 2020. Several machine learning techniques such as (i.e. topic mining and sentiment ratings over time) are applied here to identify the dynamics of emergent topics during this unprecedented time. Temporal infographics of the results captured the agency-levels variations over time in circulating information about the importance of face covering, home quarantine, social distancing and contact tracing. In addition, agencies showed differences in their discussions about community transmission, lack of personal protective equipment, testing and medical supplies, use of tobacco, vaccine, mental health issues, hospitalization, hurricane season, airports, construction work among others. Findings could support more efficient transfer of risk and response information as communities shift to new normal as well as in future pandemics.
Groups of firms often achieve a competitive advantage through the formation of geo-industrial clusters. Although many exemplary clusters, such as Hollywood or Silicon Valley, have been frequently studied, systematic approaches to identify and analyze the hierarchical structure of the geo-industrial clusters at the global scale are rare. In this work, we use LinkedIns employment histories of more than 500 million users over 25 years to construct a labor flow network of over 4 million firms across the world and apply a recursive network community detection algorithm to reveal the hierarchical structure of geo-industrial clusters. We show that the resulting geo-industrial clusters exhibit a stronger association between the influx of educated-workers and financial performance, compared to existing aggregation units. Furthermore, our additional analysis of the skill sets of educated-workers supplements the relationship between the labor flow of educated-workers and productivity growth. We argue that geo-industrial clusters defined by labor flow provide better insights into the growth and the decline of the economy than other common economic units.
Networks can describe the structure of a wide variety of complex systems by specifying which pairs of entities in the system are connected. While such pairwise representations are flexible, they are not necessarily appropriate when the fundamental interactions involve more than two entities at the same time. Pairwise representations nonetheless remain ubiquitous, because higher-order interactions are often not recorded explicitly in network data. Here, we introduce a Bayesian approach to reconstruct latent higher-order interactions from ordinary pairwise network data. Our method is based on the principle of parsimony and only includes higher-order structures when there is sufficient statistical evidence for them. We demonstrate its applicability to a wide range of datasets, both synthetic and empirical.