Inspired by the social and economic benefits of diversity, we analyze over 9 million papers and 6 million scientists to study the relationship between research impact and five classes of diversity: ethnicity, discipline, gender, affiliation, and academic age. Using randomized baseline models, we establish the presence of homophily in ethnicity, gender, and affiliation. We then study the effect of diversity on scientific impact, as reflected in citations. Remarkably, of the classes considered, ethnic diversity has the strongest correlation with scientific impact. To isolate the effects of ethnic diversity, we again use randomized baseline models and find a clear link between diversity and impact. To further support these findings, we use coarsened exact matching to compare the scientific impact of ethnically diverse papers and scientists with closely matched control groups. Here, we find that ethnic diversity results in an impact gain of 10.63% for papers and 47.67% for scientists.
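The matching step described above can be illustrated with a minimal sketch of coarsened exact matching. It is an illustration under stated assumptions, not the authors' pipeline: the record fields `diverse` and `citations`, the covariate names, and the cutpoints are all hypothetical, and the choice to weight each stratum's control mean by its number of diverse papers is one common convention rather than something the abstract specifies.

```python
from collections import defaultdict


def coarsen(value, cutpoints):
    """Return the index of the bin that `value` falls into.

    `cutpoints` is a sorted list of bin boundaries (hypothetical values).
    """
    return sum(value >= c for c in cutpoints)


def cem_impact_gain(records, cutpoints):
    """Percentage citation gain of diverse units over matched controls.

    records   : list of dicts with a boolean 'diverse' flag, a numeric
                'citations' outcome, and one entry per covariate
                (all field names are assumptions for this sketch).
    cutpoints : dict mapping covariate name -> sorted bin boundaries.
    """
    # Group units into strata defined by their coarsened covariates.
    strata = defaultdict(lambda: {True: [], False: []})
    for r in records:
        key = tuple(coarsen(r[name], cuts)
                    for name, cuts in sorted(cutpoints.items()))
        strata[key][bool(r["diverse"])].append(r["citations"])

    treated_total = 0.0
    control_total = 0.0
    for groups in strata.values():
        treated, control = groups[True], groups[False]
        if not treated or not control:
            continue  # strata lacking either group are discarded
        treated_total += sum(treated)
        # Weight the control mean by the number of treated units here.
        control_total += len(treated) * (sum(control) / len(control))

    if control_total == 0:
        raise ValueError("no matched strata with non-zero control impact")
    return 100.0 * (treated_total - control_total) / control_total
```

Calling `cem_impact_gain(records, {"year": [1995, 2005], "team_size": [5]})` would compare diverse and non-diverse papers only within the same coarsened year and team-size bins, so units without a comparable counterpart do not influence the estimate.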
Modern science is dominated by work produced in teams. A recent finding shows that both large and small teams are essential to research, prompting us to analyze the extent to which a country's scientific work is carried out by big or small teams. Here, using over 26 million publications from Web of Science, we find that China's research output is more dominated by big teams than that of the rest of the world, particularly in the natural sciences. Although the share of papers produced by big teams is rising globally, China's drop in small-team output is much steeper. As teams in China shift from small to large, the team diversity that is essential for innovative work does not increase as much as it does in other countries. Using the national average as the baseline, we find that the National Natural Science Foundation of China (NSFC) supports fewer small-team works than the U.S. National Science Foundation (NSF) does, implying that grant agencies in China favor big teams. Our findings provide new insights into concerns about originality and innovation in China and point to a need to balance small and big teams.
We provide an up-to-date view of the knowledge management system ScienceWISE (SW) and address issues related to the automatic assignment of articles to research topics. So far, SW has proven to be an effective platform for managing large volumes of technical articles by means of ontological concept-based browsing. However, as the publication of research articles accelerates, the expressivity and richness of the SW ontology turn into a double-edged sword: a more fine-grained characterization of articles becomes possible, but at the cost of introducing more spurious relations among them. In this context, the challenge of continuously recommending relevant articles to users lies in tackling a network partitioning problem, where nodes represent articles and co-occurring concepts create edges between them. In this paper, we discuss the three research directions we have taken to solve this issue: i) the identification of generic concepts to reinforce inter-article similarities; ii) the adoption of a bipartite network representation to improve scalability; iii) the design of a clustering algorithm to identify concepts for cross-disciplinary articles and obtain fine-grained topics for all articles.
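The article network described above can be sketched as a projection of a bipartite article-concept graph. This is only an illustration under assumptions: the abstract does not say how generic concepts are detected, so the sketch treats any concept attached to more than a hypothetical share `max_share` of articles as generic and ignores it when creating edges. The function names and the threshold are not taken from ScienceWISE.

```python
from collections import defaultdict
from itertools import combinations


def concept_index(article_concepts):
    """Invert the bipartite graph: concept -> set of articles tagged with it."""
    index = defaultdict(set)
    for article, concepts in article_concepts.items():
        for concept in concepts:
            index[concept].add(article)
    return index


def article_edges(article_concepts, max_share=0.5):
    """Project the bipartite graph onto articles.

    Two articles are linked once for every non-generic concept they share.
    A concept counts as generic (an assumption of this sketch) when it is
    attached to more than `max_share` of all articles.
    """
    n_articles = len(article_concepts)
    weights = defaultdict(int)
    for concept, articles in concept_index(article_concepts).items():
        if len(articles) / n_articles > max_share:
            continue  # generic concept: would link almost everything
        for a, b in combinations(sorted(articles), 2):
            weights[(a, b)] += 1
    return dict(weights)
```

Keeping the article-concept mapping as the primary structure means the pairwise article edges only need to be materialized for the concepts that survive the filter, which is one plausible reading of why a bipartite representation helps scalability.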
Science is built upon a scholarly consensus that changes over time. This raises the question of how revolutionary theories and assumptions are evaluated and accepted into the scientific norm that sets the stage for subsequent science. Using two recently proposed metrics, we identify novel papers as those with high atypicality, which models how a paper draws upon unusual combinations of prior research in crafting its own contribution, and we evaluate the recognition of novel papers by citations and by disruption, which captures the degree to which a research article creates a new direction by eclipsing citations to the prior work it builds upon. Only a small fraction of papers (2.3%) are highly novel, and the share of novel papers falls over time, with a nearly threefold decrease from 3.9% in 1970 to 1.4% in 2000. A highly novel paper indeed has a much higher chance of disrupting science (61.3%) than a conventional paper (36.4%), but this recognition arrives only in the distant future, as reflected in citations: it typically takes 10 years or longer for the disruption score of a paper to stabilize. By comparison, just under 20% of scholars, measured by their publications, survive in academia over such a long period. We also provide the first computational model that reformulates atypicality as the distance across latent knowledge spaces learned by neural networks, as a proxy for the socially agreed relevance between distinct fields of scientific knowledge. The evolution of this knowledge space characterizes how yesterday's novelty forms today's scientific conventions, which in turn condition the novelty, and surprise, of tomorrow's breakthroughs. This computational model may be used to inform science policy that aims to recognize and cultivate novelty, so as to mitigate the conflict between individual career success and collective advance in science and to direct human creativity toward the unknown frontier of scientific knowledge.
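The disruption measure mentioned above is commonly computed as D = (n_i - n_j) / (n_i + n_j + n_k), where n_i counts later papers that cite the focal paper but none of its references, n_j counts those that cite both, and n_k counts those that cite only the references. The sketch below assumes this widely used formulation; the abstract does not spell out the exact variant, and the input layout (a dictionary from citing paper to the set of papers it cites) is a hypothetical choice made for illustration.

```python
def disruption(focal, references, citations):
    """Disruption score of `focal` in the range [-1, 1].

    focal      : identifier of the focal paper
    references : identifiers of the papers the focal paper cites
    citations  : dict mapping each later paper to the set of papers it
                 cites (hypothetical input layout for this sketch)
    """
    refs = set(references)
    n_i = n_j = n_k = 0
    for citing, cited in citations.items():
        if citing == focal:
            continue
        cites_focal = focal in cited
        cites_refs = bool(refs & set(cited))
        if cites_focal and not cites_refs:
            n_i += 1  # builds on the focal paper alone
        elif cites_focal and cites_refs:
            n_j += 1  # builds on the focal paper and its predecessors
        elif cites_refs:
            n_k += 1  # bypasses the focal paper
    total = n_i + n_j + n_k
    if total == 0:
        raise ValueError("no later paper cites the focal paper or its references")
    return (n_i - n_j) / total
```

A score near 1 means later work cites the focal paper while ignoring what it built upon, and a score near -1 means the focal paper is mostly cited together with its predecessors. Because the counts keep changing as new citing papers appear, the score can take many years to settle, which is consistent with the stabilization time reported above.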
Throughout history, a relatively small number of individuals have made a profound and lasting impact on science and society. Despite long-standing, multidisciplinary interest in understanding the careers of elite scientists, there have been few attempts at a quantitative, career-level analysis. Here, we leverage a comprehensive dataset we assembled, which allows us to trace the entire career histories of nearly all Nobel laureates in physics, chemistry, and physiology or medicine over the past century. We find that, although Nobel laureates were energetic producers from the outset, producing works that garner unusually high impact, their careers before winning the prize follow patterns relatively similar to those of ordinary scientists, characterized by hot streaks and an increasing reliance on collaborations. We also uncover notable variations along their careers, often associated with the Nobel Prize, including a shifting coauthorship structure in the prize-winning work and a significant but temporary dip in the impact of the work they produce after winning the prize. Together, these results document quantitative patterns governing the careers of scientific elites, offering an empirical basis for a deeper understanding of the hallmarks of exceptional careers in science.
Researchers increasingly use platforms like Twitter to spread information about their ideas and empirical evidence. Recent studies have shown that social media affects the scientific impact of a paper. However, these studies use only tweet counts to represent Twitter activity. In this paper, we propose TweetPap, a large-scale dataset that adds temporal information on citations and tweets, together with tweet metadata, to quantify and understand the discourse around scientific papers on social media. The dataset is publicly available at https://github.com/lingo-iitgn/TweetPap