The advancement of science as outlined by Popper and Kuhn is largely qualitative, but with bibliometric data it is possible and desirable to develop a quantitative picture of scientific progress. It is also important to allocate finite resources to research topics that have growth potential, so as to accelerate the process from scientific breakthroughs to technological innovations. In this paper, we address this problem of quantitative knowledge evolution by analysing the APS publication data set from 1981 to 2010. We build the bibliographic coupling and co-citation networks, use the Louvain method to detect topical clusters (TCs) in each year, measure the similarity of TCs in consecutive years, and visualize the results as alluvial diagrams. Given the predictive features describing a TC and its known evolution in the next year, we train a machine learning model to predict future changes of TCs, i.e., whether they continue, dissolve, merge, or split. We found the number of papers from certain journals and the degree, closeness, and betweenness centralities to be the most predictive features. Additionally, betweenness increases significantly for merging events and decreases significantly for splitting events. Our results represent a first step from a descriptive understanding of the Science of Science (SciSci) towards one that is ultimately prescriptive.
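The pipeline described above starts from bibliographic coupling, in which two papers are linked with a weight equal to the number of references they share. The following is a minimal sketch of that step and of one possible year-to-year similarity measure. The Jaccard overlap of cluster reference sets is an assumption chosen for illustration, since the abstract does not say which similarity measure is used, and all paper and reference identifiers are made up.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Map each coupled pair of papers to its number of shared references.

    `references` maps a paper id to the set of ids it cites.
    """
    weights = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            weights[(a, b)] = shared
    return weights


def cluster_references(cluster, references):
    """Union of the reference sets of all papers in a cluster."""
    refs = set()
    for paper in cluster:
        refs |= references[paper]
    return refs


def jaccard(set_a, set_b):
    """Jaccard overlap; an assumed similarity measure, not the paper's."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy data (hypothetical): papers published in year t and in year t + 1.
refs_t = {"p1": {"r1", "r2", "r3"}, "p2": {"r2", "r3", "r4"}, "p3": {"r7", "r8"}}
refs_t1 = {"q1": {"r2", "r3", "r5"}, "q2": {"r8", "r9"}}

weights_t = bibliographic_coupling(refs_t)

# Clusters would normally come from community detection (e.g. Louvain);
# here they are written by hand.
similarity = jaccard(
    cluster_references({"p1", "p2"}, refs_t),
    cluster_references({"q1"}, refs_t1),
)
```

In a full analysis the hand-written clusters would be replaced by the output of a community detection routine run on the weighted network for each year.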
Human mobility has a significant impact on several layers of society, from infrastructural planning and economics to the spread of diseases and crime. Representing the system as a complex network, in which nodes are assigned to regions (e.g., a city) and links indicate the flow of people between two of them, physics-inspired models have been proposed to quantify the number of people migrating from one city to another. Despite the advances made by these models, our ability to predict the number of commuters and reconstruct mobility networks remains limited. Here, we propose an alternative approach using machine learning and 22 urban indicators to predict the flow of people and reconstruct the intercity commuter network. Our results reveal that predictions based on machine learning algorithms and urban indicators can reconstruct the commuter network with 90.4% accuracy and describe 77.6% of the variance observed in the flow of people between cities. We also identify the features essential to recovering the network structure and the urban indicators most closely related to commuting patterns. As previously reported, distance plays a significant role in commuting, but other indicators, such as Gross Domestic Product (GDP) and unemployment rate, are also driving forces for people to commute. We believe that our results shed new light on the modeling of migration and reinforce the role of urban indicators in commuting patterns. Also, because link prediction and network reconstruction are still open challenges in network science, our results have implications for other areas, such as economics, social sciences, and biology, where node attributes can give us information about the existence of links connecting entities in the network.
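To make the contrast between the two approaches concrete, the sketch below shows a gravity-model baseline, a well-known example of a physics-inspired mobility model, next to a feature vector of the kind a machine learning model could consume for a city pair. The abstract does not name the gravity model or describe its feature encoding, so both functions, the parameter defaults, and the indicator names are illustrative assumptions.

```python
def gravity_flow(pop_i, pop_j, distance_km, k=1.0, beta=2.0):
    """Gravity-model estimate of the flow between two cities.

    k and beta are free parameters normally fitted to data; the
    defaults are placeholders, not values taken from the paper.
    """
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return k * pop_i * pop_j / distance_km ** beta


def pair_features(indicators_i, indicators_j, distance_km):
    """Feature vector for an ordered city pair (hypothetical encoding).

    Concatenates the distance with both cities' indicators, taken in
    alphabetical order of the indicator names.
    """
    names = sorted(indicators_i)
    return (
        [distance_km]
        + [indicators_i[n] for n in names]
        + [indicators_j[n] for n in names]
    )


# Made-up indicator values for two cities.
city_a = {"gdp": 50.0, "unemployment": 0.08}
city_b = {"gdp": 20.0, "unemployment": 0.12}

baseline = gravity_flow(1000, 2000, 10.0)
features = pair_features(city_a, city_b, 10.0)
```

A supervised model would then be trained on such feature vectors, with the observed flow (or the presence of a link) as the target.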
This pair of CAS lectures gives an introduction for accelerator physics students to the framework and terminology of machine learning (ML). We start by introducing the language of ML through a simple example of linear regression, including a probabilistic perspective to introduce the concepts of maximum likelihood estimation (MLE) and maximum a posteriori (MAP) estimation. We then apply the concepts to examples of neural networks and logistic regression. Next, we introduce non-parametric models and the kernel method, and give a brief introduction to two other machine learning paradigms: unsupervised and reinforcement learning. Finally, we close with example applications of ML at a free-electron laser.
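The difference between the two estimators can be seen in the simplest possible setting: a one-parameter linear model y = w x with Gaussian noise. The sketch below assumes a zero-mean Gaussian prior on w, in which case the MAP estimate is the MLE with an extra term in the denominator equal to the ratio of noise variance to prior variance. The function names and the toy data are illustrative and are not taken from the lectures.

```python
def mle_slope(xs, ys):
    """Maximum likelihood slope for y = w * x with Gaussian noise.

    Equivalent to ordinary least squares without an intercept.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


def map_slope(xs, ys, noise_var, prior_var):
    """MAP slope under an assumed zero-mean Gaussian prior on w.

    The prior adds noise_var / prior_var to the denominator, which
    shrinks the estimate towards zero (ridge regression).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + noise_var / prior_var)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_mle = mle_slope(xs, ys)
w_map = map_slope(xs, ys, noise_var=14.0, prior_var=1.0)
```

As the prior variance grows, the prior becomes uninformative and the MAP estimate approaches the MLE.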
High-temperature alloy design requires concurrent consideration of multiple mechanisms at different length scales. We propose a workflow that couples highly relevant physics into machine learning (ML) to predict properties of complex high-temperature alloys, using the yield strength of 9-12 wt.% Cr steels as an example. We have incorporated synthetic alloy features that capture microstructure and phase transformations into the dataset. The high-impact features affecting the yield strength of 9Cr steels, identified through correlation analysis, agree well with the generally accepted strengthening mechanism. As part of the verification process, the consistency of sub-datasets has been extensively evaluated with respect to temperature and then refined for the boundary conditions of trained ML models. The yield strength of 9Cr steels predicted by the ML models is in excellent agreement with experiments. The current approach introduces physically meaningful constraints in interrogating the trained ML models to predict properties of hypothetical alloys when applied to data-driven materials.
Even as we advance the frontiers of physics knowledge, our understanding of how this knowledge evolves remains at the descriptive level of Popper and Kuhn. Using the APS publications data sets, we ask in this letter how new knowledge is built upon old knowledge. We do so by constructing year-to-year bibliographic coupling networks and identifying in them validated communities that represent different research fields. We then visualize their evolutionary relationships in the form of alluvial diagrams, and show how they remain intact through APS journal splits. Quantitatively, we see that most fields undergo weak Popperian mixing, and that it is rare for a field to remain isolated or to undergo strong mixing. The sizes of fields obey a simple linear growth with recombination. We can also reliably predict the merging of two fields, but not the considerably more complex splitting. Finally, we report a case study of two fields that underwent repeated merging and splitting around 1995, and show how these Kuhnian events are correlated with breakthroughs on BEC, quantum teleportation, and slow light. This impact showed up quantitatively in the citations of the BEC field as a larger proportion of references dating from during and shortly after these events.
Classical and exceptional Lie algebras and their representations are among the most important tools in the analysis of symmetry in physical systems. In this letter we show how the computation of tensor products and branching rules of irreducible representations is machine-learnable, and can achieve relative speed-ups of orders of magnitude compared with non-ML algorithms.
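For readers unfamiliar with the task, the simplest instance of a tensor product decomposition is the familiar angular momentum addition rule for SU(2): the product of spins j1 and j2 contains each spin from |j1 - j2| to j1 + j2 exactly once. The sketch below implements only this textbook rule as an illustration of what is being computed. It is not the letter's method and does not cover the general classical or exceptional algebras. Spins are stored as twice their value so that half-integers stay exact, and the function names are hypothetical.

```python
def su2_tensor_product(two_j1, two_j2):
    """Decompose the product of two SU(2) irreps.

    Arguments and results are twice the spin (2j), so spin 1/2 is 1.
    Returns the list of 2j values, each appearing once.
    """
    if two_j1 < 0 or two_j2 < 0:
        raise ValueError("2j must be a non-negative integer")
    return list(range(abs(two_j1 - two_j2), two_j1 + two_j2 + 1, 2))


def dimension(two_j):
    """Dimension 2j + 1 of the spin-j irrep."""
    return two_j + 1


# Spin 1/2 times spin 1 gives spin 1/2 plus spin 3/2.
parts = su2_tensor_product(1, 2)

# Consistency check: dimensions on both sides of the decomposition agree.
dims_match = dimension(1) * dimension(2) == sum(dimension(t) for t in parts)
```

For higher-rank algebras the analogous computation requires weight-space methods and becomes far more expensive, which is the cost the ML approach aims to reduce.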