Nodes in networks may have one or more functions that determine their role in the system. As opposed to local proximity, which captures the local context of a node, role identity captures the functional role that a node plays in a network, such as being the center of a group or the bridge between two groups. This means that nodes far apart in a network can have similar structural role identities. Several recent works have explored methods for embedding the roles of nodes in networks; however, these methods all rely on approximating or indirectly modeling structural equivalence. In this paper, we present a novel framework that uses stress majorization to transform the high-dimensional role identities in networks directly (without approximation or indirect modeling) into a low-dimensional embedding space. Our method is also flexible, in that it does not rely on a specific definition of structural similarity. We evaluated our method on node classification, clustering, and visualization tasks, using three real-world and five synthetic networks. Our experiments show that our framework outperforms existing methods in learning node role representations.
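As a concrete illustration of the core mechanism (a minimal sketch, not the paper's implementation), stress majorization can embed a precomputed role-dissimilarity matrix via the SMACOF (Guttman transform) iteration; the matrix `D` below is a hypothetical toy example:

```python
import numpy as np

def stress_majorization(D, dim=2, iters=300, seed=0):
    """Minimize the stress sum_{i<j} (D_ij - ||x_i - x_j||)^2 with the
    SMACOF (Guttman transform) iteration, assuming unit weights."""
    n = D.shape[0]
    X = np.random.default_rng(seed).standard_normal((n, dim))
    for _ in range(iters):
        diff = X[:, None, :] - X[None, :, :]
        dist = np.linalg.norm(diff, axis=-1)
        np.fill_diagonal(dist, 1.0)            # avoid division by zero
        B = -D / dist
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))    # rows of B sum to zero
        X = B @ X / n                          # Guttman transform update
    return X

# Toy role-dissimilarity matrix: nodes {0, 1} and {2, 3} play similar roles.
D = np.array([[0., 1., 4., 4.],
              [1., 0., 4., 4.],
              [4., 4., 0., 1.],
              [4., 4., 1., 0.]])
X = stress_majorization(D)
```

Because the update minimizes stress directly on the dissimilarities, nodes with similar role identities end up close in the embedding regardless of their graph distance.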
Recent years have seen a rise in the development of representation learning methods for graph data. Most of these methods, however, focus on node-level representation learning at various scales (e.g., microscopic, mesoscopic, and macroscopic node embedding). In comparison, methods for representation learning on whole graphs are still relatively scarce. In this paper, we propose a novel unsupervised whole-graph embedding method. Our method uses spectral graph wavelets to capture topological similarities on each k-hop sub-graph between nodes and uses them to learn embeddings for the whole graph. We evaluate our method against 12 well-known baselines on 4 real-world datasets and show that it achieves the best performance across all experiments, outperforming the current state-of-the-art by a considerable margin.
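The flavour of this approach can be sketched with heat-kernel spectral wavelets: compute wavelet coefficients from the graph Laplacian, summarize each node's coefficients with an empirical characteristic function, and pool over nodes into a fixed-size vector. This is a rough sketch in the spirit of wavelet-based graph signatures, not the paper's exact pipeline:

```python
import numpy as np

def graph_signature(A, scales=(0.5, 1.0), t_grid=(0.5, 1.0, 1.5, 2.0)):
    """Fixed-size whole-graph vector from heat-kernel spectral wavelets."""
    L = np.diag(A.sum(axis=1)) - A                 # combinatorial Laplacian
    lam, U = np.linalg.eigh(L)
    feats = []
    for s in scales:
        W = U @ np.diag(np.exp(-s * lam)) @ U.T    # wavelet centred at each node
        for t in t_grid:
            phi = np.exp(1j * t * W).mean(axis=0)  # characteristic fn per node
            feats += [phi.real.mean(), phi.imag.mean()]  # pool over nodes
    return np.array(feats)

# 4-node path graph as a toy input.
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
sig = graph_signature(A)
```

Since the pooling averages over nodes and the heat kernel is a matrix function of the Laplacian, the signature is invariant to node relabeling, which is the property a whole-graph embedding needs.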
A key problem in multi-task learning (MTL) research is how to select high-quality auxiliary tasks automatically. This paper presents GradTS, an automatic auxiliary task selection method based on gradient calculation in Transformer-based models. Compared to AUTOSEM, a strong baseline method, GradTS improves the performance of MT-DNN with a bert-base-cased backend model by 0.33% to 17.93% on 8 natural language understanding (NLU) tasks in the GLUE benchmarks. GradTS is also time-saving since (1) its gradient calculations are based on single-task experiments and (2) the gradients are re-used without additional experiments when the candidate task set changes. On the 8 GLUE classification tasks, for example, GradTS costs on average 21.32% less time than AUTOSEM with comparable GPU consumption. Further, we show the robustness of GradTS across various task settings and model selections, e.g., mixed objectives among candidate tasks. The efficiency and efficacy of GradTS in these case studies illustrate its general applicability in MTL research without requiring manual task filtering or costly parameter tuning.
We propose a new kind of geometric effective theory, based on a curved space-time single-valley Dirac theory with spin connection, for twisted bilayer graphene at generic twist angles. This model reproduces the nearly flat bands with particle-hole symmetry around the first magic angle. The resulting band width is close to earlier results from the Bistritzer-MacDonald model and density matrix renormalization group calculations. Moreover, this geometric formalism allows one to predict properties of rotating bilayer graphene that cannot be accessed by earlier theories. As an example, we investigate the Bott index of rotating bilayer graphene. We relate this index to a two-dimensional Thouless pump with quantized charge pumping during one driving period, which could be verified by transport measurement.
Recently, several works have observed an interesting phenomenon: adversarially robust classifiers can generate images of quality comparable to generative models. We investigate this phenomenon from an energy perspective and provide a novel explanation. We reformulate adversarial example generation, adversarial training, and image generation in terms of an energy function. We find that adversarial training yields an energy function that is flat and has low energy around the real data, which is the key to generative capability. Based on this new understanding, we further propose an improved adversarial training method, Joint Energy Adversarial Training (JEAT), which can generate high-quality images and achieves new state-of-the-art robustness under a wide range of attacks. The Inception Score of the CIFAR-10 images generated by JEAT is 8.80, much better than that of the original robust classifiers (7.50).
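In the energy-based view this abstract alludes to, a classifier's logits induce an energy E(x) = -log Σ_y exp f(x)[y], and image generation amounts to descending this energy. A minimal, numerically stable sketch of the energy itself:

```python
import numpy as np

def energy(logits):
    """E(x) = -logsumexp_y f(x)[y]; lower energy = more 'data-like'.
    Subtracting the max keeps the exponentials from overflowing."""
    m = logits.max()
    return -(m + np.log(np.exp(logits - m).sum()))

# Uniform logits over 10 classes give E = -log(10);
# a confident prediction drives the energy lower.
e_uniform = energy(np.zeros(10))
e_confident = energy(np.array([5., 0., 0., 0., 0., 0., 0., 0., 0., 0.]))
```

In practice, sample generation would take gradient steps x ← x − η ∇ₓE(x), with the gradient computed by automatic differentiation through the classifier.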
This paper studies the relative importance of attention heads in Transformer-based models to aid their interpretability in cross-lingual and multi-lingual tasks. Prior research has found that only a few attention heads are important in each mono-lingual Natural Language Processing (NLP) task, and that pruning the remaining heads leads to comparable or improved performance of the model. However, the impact of pruning attention heads is not yet clear in cross-lingual and multi-lingual tasks. Through extensive experiments, we show that (1) pruning a number of attention heads in a multi-lingual Transformer-based model has, in general, positive effects on its performance in cross-lingual and multi-lingual tasks and (2) the attention heads to be pruned can be ranked using gradients and identified with a few trial experiments. Our experiments focus on sequence labeling tasks, with potential applicability to other cross-lingual and multi-lingual tasks. For comprehensiveness, we examine two pre-trained multi-lingual models, namely multi-lingual BERT (mBERT) and XLM-R, on three tasks across 9 languages each. We also discuss the validity of our findings and their extensibility to truly resource-scarce languages and other task settings.
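As a toy illustration of gradient-based head ranking (the scoring rule here is an assumption for illustration, not the paper's exact criterion), suppose per-head gradient magnitudes have already been collected over a few batches; heads can then be ordered for pruning:

```python
import numpy as np

def pruning_order(head_grads):
    """head_grads: (n_batches, n_layers, n_heads) array of gradients
    w.r.t. each head's output (hypothetical pre-computed values).
    Scores each head by its mean absolute gradient and returns
    (layer, head) pairs, least important first."""
    imp = np.abs(head_grads).mean(axis=0)
    flat = np.argsort(imp, axis=None)              # ascending importance
    return np.column_stack(np.unravel_index(flat, imp.shape))

# Hypothetical gradients: head (1, 0) matters most, head (0, 0) least.
g = np.zeros((3, 2, 2))
g[:, 1, 0] = 5.0
g[:, 0, 1] = 1.0
g[:, 1, 1] = 0.5
order = pruning_order(g)
```

Pruning would then proceed from the front of `order`, with a few trial experiments to decide how many heads to remove.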
Anomaly detection plays a key role in industrial manufacturing for product quality control. Traditional methods for anomaly detection are rule-based, with limited generalization ability. Recent methods based on supervised deep learning are more powerful but require large-scale annotated datasets for training. In practice, abnormal products are rare, so it is very difficult to train a deep model in a fully supervised way. In this paper, we propose a novel unsupervised anomaly detection approach based on the Self-organizing Map (SOM). Our method, Self-organizing Map for Anomaly Detection (SOMAD), maintains normal characteristics by using a topological memory based on multi-scale features. SOMAD achieves state-of-the-art performance on unsupervised anomaly detection and localization on the MVTec dataset.
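The SOM mechanism itself is simple to sketch: fit a small grid of prototypes to normal-only features, then score a test sample by its distance to the best matching unit. This toy version (plain NumPy, hypothetical 2-D features) omits the paper's multi-scale feature extraction:

```python
import numpy as np

def train_som(X, grid=(4, 4), iters=500, lr=0.5, sigma=1.0, seed=0):
    """Fit a self-organizing map to normal-only feature vectors X."""
    rng = np.random.default_rng(seed)
    h, w = grid
    W = rng.standard_normal((h * w, X.shape[1]))
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))      # best matching unit
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)   # grid distance to BMU
        nbh = np.exp(-d2 / (2 * sigma ** 2))             # neighbourhood pull
        W += lr * (1 - t / iters) * nbh[:, None] * (x - W)
    return W

def anomaly_score(W, x):
    """Distance to the nearest prototype; large = anomalous."""
    return np.sqrt(((W - x) ** 2).sum(axis=1).min())

normal = np.random.default_rng(1).standard_normal((200, 2))
som = train_som(normal)
```

Because the map is trained only on normal samples, anything far from every prototype is flagged, which is exactly the "topological memory of normal characteristics" idea.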
We use numerical relativity to study the merger and ringdown stages of superkick binary black hole systems (those with equal mass and anti-parallel spins). We find a universal way to describe the mass and current quadrupole gravitational waves emitted by these systems during the merger and ringdown stages: (i) the time evolutions of these waves are insensitive to the progenitors' parameters (spins) after being normalized by their own peak values; (ii) the peak values, which encode all the spin information of the progenitor, can be consistently fitted to formulas inspired by post-Newtonian theory. We find that the universal evolution of the mass quadrupole wave can be accurately modeled by the so-called Backwards One-Body (BOB) model. However, the BOB model, in its present form, leads to a lower waveform match and a significant parameter-estimation bias for the current quadrupole wave. We also decompose the ringdown signal into seven overtones and study the dependence of the mode amplitudes on the progenitors' parameters. This dependence is found to be insensitive to the overtone index (up to a scaling factor). Finally, we use the Fisher matrix technique to investigate how the ringdown waveform can be at least as important for parameter estimation as the inspiral stage. Assuming the Cosmic Explorer detector, we find that the contribution of the ringdown portion dominates once the total mass exceeds ~250 solar masses. For massive BBH systems, the accuracy of parameter measurement is improved by incorporating the information of the ringdown: the ringdown sector gives rise to a different parameter correlation from the inspiral stage, hence the overall parameter correlation is reduced in the full signal.
Shaobo Liu, Jie Yuan, Sheng Ma (2021)
The angular-dependent magnetoresistance (AMR) of the ab plane is measured on single crystals of FeSe1-xSx (x = 0, 0.07, 0.13 and 1) and FeSe1-yTey (y = 0.06, 0.61 and 1) at various temperatures under fields up to 9 T. A pronounced twofold-anisotropic carrier-scattering effect is identified by AMR and attributed to a magnetic-field-induced spin nematicity that emerges from the tetragonal normal-state regime below a characteristic temperature Tsn. This magnetically polarized spin nematicity is found to be ubiquitous in the isoelectronic FeSe1-xSx and FeSe1-yTey systems, regardless of whether the sample shows an electronic nematic order at Ts < Tsn, an antiferromagnetic order at TN < Tsn, or neither order. Importantly, we find that isoelectronic substitution with sulfur does not suppress but even enhances the characteristic Tsn of the induced spin nematicity in FeSe1-xSx samples. This contrasts sharply with their rapidly suppressed Ts, the transition temperature of the spontaneous electronic nematicity. Furthermore, we find that superconductivity is significantly suppressed with the enhancement of the induced spin nematicity in both FeSe1-xSx and FeSe1-yTey samples.
Traditional crowd counting approaches usually use a Gaussian assumption to generate pseudo density ground truth, which suffers from problems such as inaccurate estimation of the Gaussian kernel sizes. In this paper, we propose a new measure-based counting approach that regresses the predicted density maps to the scattered point-annotated ground truth directly. First, crowd counting is formulated as a measure matching problem. Second, we derive a semi-balanced form of Sinkhorn divergence, based on which a Sinkhorn counting loss is designed for measure matching. Third, we propose a self-supervised mechanism by devising a Sinkhorn scale consistency loss to resist scale changes. Finally, an efficient optimization method is provided to minimize the overall loss function. Extensive experiments on four challenging crowd counting datasets, namely ShanghaiTech, UCF-QNRF, JHU++, and NWPU, validate the proposed method.
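For reference, the standard balanced Sinkhorn iteration underlying such divergences (the paper's semi-balanced variant modifies the marginal constraints) looks like this:

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, iters=200):
    """Entropy-regularized optimal transport between histograms a and b
    with cost matrix C; returns the transport plan and its cost."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                  # enforce column marginals
        u = a / (K @ v)                    # enforce row marginals
    P = u[:, None] * K * v[None, :]
    return P, (P * C).sum()

# Toy problem: identical measures, so (almost) no mass needs to move.
a = b = np.array([0.5, 0.5])
C = np.array([[0., 1.],
              [1., 0.]])
P, cost = sinkhorn(a, b, C)
```

In the counting setting, `a` would roughly correspond to the normalized predicted density map and `b` to the uniform measure over the annotated head points, with `C` built from pixel-to-point distances.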