350 - Jing Xu , Fei Han , Ting-Ting Wang 2021
A notable phenomenon in topological semimetals is the violation of Kohler's rule, which dictates that the magnetoresistance $MR$ obeys a scaling behavior $MR = f(H/\rho_0)$, where $MR = [\rho_H-\rho_0]/\rho_0$ and $H$ is the magnetic field, with $\rho_H$ and $\rho_0$ being the resistivity at $H$ and zero field, respectively. Here we report a violation originating from a thermally induced change in the carrier density. We find that the magnetoresistance of the Weyl semimetal TaP follows an extended Kohler's rule $MR = f[H/(n_T\rho_0)]$, with $n_T$ describing the temperature dependence of the carrier density. We show that $n_T$ is associated with the Fermi level and the dispersion relation of the semimetal, providing a new way to reveal information on the electronic band structure. We offer a fundamental understanding of the violation and validity of Kohler's rule in terms of the different temperature responses of $n_T$. We apply our extended Kohler's rule to BaFe$_2$(As$_{1-x}$P$_x$)$_2$ to settle a long-standing debate on the scaling behavior of the normal-state magnetoresistance of a superconductor, namely $MR \sim \tan^2\theta_H$, where $\theta_H$ is the Hall angle. We further validate the extended Kohler's rule and demonstrate its generality in a semiconductor, InSb, where the temperature-dependent carrier density can be reliably determined both theoretically and experimentally.
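The scaling collapse implied by the extended Kohler's rule can be illustrated numerically. In this sketch the quadratic master function $f(x)=x^2$ and the chosen values of $\rho_0$ and $n_T$ are illustrative assumptions, not data from the paper:

```python
def mr_curve(H_vals, rho0, nT, f=lambda x: x * x):
    # extended Kohler's rule: MR depends on field only through H/(nT*rho0);
    # f(x) = x^2 stands in for the (unknown) material-specific scaling function
    return [f(H / (nT * rho0)) for H in H_vals]

fields = [0.5 * i for i in range(1, 9)]
# two temperatures with different zero-field resistivity and carrier density
mr_low_T = mr_curve(fields, rho0=1.0, nT=1.0)
mr_high_T = mr_curve(fields, rho0=2.5, nT=0.6)

# plotted against the extended variable H/(nT*rho0), both curves trace out
# the same master function f, i.e. they "collapse" onto one curve
x_low = [H / (1.0 * 1.0) for H in fields]
x_high = [H / (0.6 * 2.5) for H in fields]
collapse = all(abs(mr - x * x) < 1e-12
               for mr, x in zip(mr_low_T + mr_high_T, x_low + x_high))
```

Plotting $MR$ against the bare $H/\rho_0$ instead would leave the two curves separated, which is exactly the "violation" the extended rule repairs.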
104 - Zongtao Liu , Jing Xu , Jintao Su 2021
Recently, several studies have explored the use of neural networks to solve various routing problems, a promising direction. These studies usually design an encoder-decoder framework that uses encoder embeddings of nodes and problem-specific context to produce a node sequence (path), and further optimize the result with beam search. However, existing models only support node coordinates as input, ignore the self-referential property of the studied routing problems, and do not account for the low reliability of the initial stage of node selection, so they are hard to apply in real-world settings. In this paper, we take the orienteering problem as an example to tackle these limitations. We propose a novel combination of a variant beam search algorithm and a learned heuristic for solving the general orienteering problem. We acquire the heuristic with an attention network that takes the distances among nodes as input, and learn it via a reinforcement learning framework. Empirical studies show that our method surpasses a wide range of baselines and achieves results close to those of the optimal or highly specialized approaches. Moreover, our framework can easily be applied to other routing problems. Our code is publicly available.
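A minimal sketch of beam search for the orienteering problem (maximize collected prize within a distance budget). Here a plain total-prize ranking stands in for the learned attention heuristic, and the small instance is invented for illustration:

```python
def beam_search_orienteering(dist, prize, budget, start=0, beam_width=3):
    # state: (path, distance_spent, prize_collected)
    beams = [([start], 0.0, prize[start])]
    best = beams[0]
    while beams:
        candidates = []
        for path, spent, collected in beams:
            cur = path[-1]
            for nxt in range(len(prize)):
                if nxt in path or spent + dist[cur][nxt] > budget:
                    continue  # already visited or would exceed the budget
                candidates.append(
                    (path + [nxt], spent + dist[cur][nxt], collected + prize[nxt]))
        # rank expansions by a simple heuristic score (total prize here);
        # in the paper this ranking would come from the learned attention network
        candidates.sort(key=lambda s: s[2], reverse=True)
        beams = candidates[:beam_width]
        for b in beams:
            if b[2] > best[2]:
                best = b
    return best

# toy instance: 4 nodes, symmetric distances, prize per node
dist = [[0, 2, 2, 3],
        [2, 0, 1, 2],
        [2, 1, 0, 2],
        [3, 2, 2, 0]]
prize = [0, 5, 4, 3]
best = beam_search_orienteering(dist, prize, budget=4, start=0)
```

On this instance the search finds the route 0 → 1 → 2 with prize 9 and cost 3, which is optimal within the budget of 4.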
We present Voxel Transformer (VoTr), a novel and effective voxel-based Transformer backbone for 3D object detection from point clouds. Conventional 3D convolutional backbones in voxel-based 3D detectors cannot efficiently capture large context information, which is crucial for object recognition and localization, owing to their limited receptive fields. In this paper, we resolve the problem by introducing a Transformer-based architecture that enables long-range relationships between voxels by self-attention. Given that non-empty voxels are naturally sparse but numerous, directly applying a standard Transformer on voxels is non-trivial. To this end, we propose the sparse voxel module and the submanifold voxel module, which can operate on the empty and non-empty voxel positions effectively. To further enlarge the attention range while maintaining comparable computational overhead to the convolutional counterparts, we propose two attention mechanisms for multi-head attention in those two modules: Local Attention and Dilated Attention, and we further propose Fast Voxel Query to accelerate the querying process in multi-head attention. VoTr contains a series of sparse and submanifold voxel modules and can be applied in most voxel-based detectors. Our proposed VoTr shows consistent improvement over the convolutional baselines while maintaining computational efficiency on the KITTI dataset and the Waymo Open dataset.
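The querying idea can be sketched with a hash map over non-empty voxel coordinates: only occupied neighbor positions inside a local window contribute to attention. This toy uses scalar features and dot-product weights in place of full multi-head attention, so it illustrates the sparsity-aware lookup rather than the paper's actual implementation:

```python
import math

def local_voxel_attention(voxels, window=1):
    # voxels: dict mapping (x, y, z) -> scalar feature; the dict membership
    # test plays the role of a fast voxel query over the sparse grid
    offsets = [(dx, dy, dz)
               for dx in range(-window, window + 1)
               for dy in range(-window, window + 1)
               for dz in range(-window, window + 1)]
    out = {}
    for pos, q in voxels.items():
        keys = []
        for off in offsets:
            nb = (pos[0] + off[0], pos[1] + off[1], pos[2] + off[2])
            if nb in voxels:          # skip empty positions entirely
                keys.append(voxels[nb])
        # softmax over dot-product scores, then weighted sum of values
        scores = [q * k for k in keys]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out[pos] = sum(wi * k for wi, k in zip(w, keys)) / z
    return out

# two far-apart voxels attend only to themselves
out = local_voxel_attention({(0, 0, 0): 1.0, (5, 5, 5): 3.0})
```

Enlarging `window`, or sampling offsets at increasing strides, would correspond to the Local and Dilated Attention variants respectively.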
Despite recent improvements in open-domain dialogue models, state-of-the-art models are trained and evaluated on short conversations with little context. In contrast, the long-term conversation setting has hardly been studied. In this work we collect and release a human-human dataset consisting of multiple chat sessions in which the speaking partners learn about each other's interests and discuss the things they have learned in past sessions. We show that existing models trained on existing datasets perform poorly in this long-term conversation setting in both automatic and human evaluations, and we study long-context models that can perform much better. In particular, we find that retrieval-augmented methods and methods with the ability to summarize and recall previous conversations outperform the standard encoder-decoder architectures currently considered state of the art.
90 - Jing Xu , Sen Wang , Liwei Wang 2021
Federated Learning is a distributed machine learning approach that enables model training without data sharing. In this paper, we propose a new federated learning algorithm, Federated Averaging with Client-level Momentum (FedCM), to tackle the problems of partial participation and client heterogeneity in real-world federated learning applications. FedCM aggregates global gradient information from previous communication rounds and modifies client gradient descent with a momentum-like term, which can effectively correct the bias and improve the stability of local SGD. We provide a theoretical analysis to highlight the benefits of FedCM. We also perform extensive empirical studies and demonstrate that FedCM achieves superior performance on various tasks and is robust to different numbers of clients, participation rates, and levels of client heterogeneity.
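A minimal sketch of a momentum-corrected local update in this spirit: each local SGD step mixes the client's own gradient with a server-side momentum term that summarizes previous rounds. The quadratic client objectives, step sizes, and mixing weight `alpha` are illustrative assumptions, not the paper's exact hyperparameters:

```python
def fedcm_round(x, delta, client_grads, lr_local=0.1, local_steps=5, alpha=0.1):
    # one communication round: every client runs local SGD, but each local
    # step mixes its own gradient with the server momentum term delta,
    # which damps client drift under heterogeneity
    updates = []
    for grad in client_grads:
        xi = x
        for _ in range(local_steps):
            xi -= lr_local * (alpha * grad(xi) + (1 - alpha) * delta)
        updates.append(x - xi)
    avg = sum(updates) / len(updates)
    new_delta = avg / (lr_local * local_steps)  # refreshed global-gradient estimate
    return x - avg, new_delta

# two heterogeneous clients with quadratic losses (x-1)^2 and (x-3)^2;
# the shared optimum is x = 2, which neither client's own optimum matches
clients = [lambda x: 2 * (x - 1), lambda x: 2 * (x - 3)]
x, delta = 0.0, 0.0
for _ in range(100):
    x, delta = fedcm_round(x, delta, clients)
```

After 100 rounds the iterate settles near the shared optimum x = 2 rather than drifting toward either client's local minimum.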
204 - Lucas Caccia , Jing Xu , Myle Ott 2021
Classical machine learning frameworks assume access to a possibly large dataset in order to train a predictive model. In many practical applications, however, data does not arrive all at once, but in batches over time. This creates a natural trade-off between the accuracy of a model and the time to obtain it. A greedy predictor could produce non-trivial predictions by training on batches as soon as they become available, but it may make sub-optimal use of future data. On the other hand, a tardy predictor could wait a long time to aggregate several batches into a larger dataset, but ultimately deliver much better performance. In this work, we consider such a streaming learning setting, which we dub \emph{anytime learning at macroscale} (ALMA). It is an instance of anytime learning applied not at the level of a single chunk of data, but at the level of the entire sequence of large batches. We first formalize this learning setting, then introduce metrics to assess how well learners perform on a given task for a given memory and compute budget, and finally test several baseline approaches on standard benchmarks repurposed for anytime learning at macroscale. The general finding is that bigger models always generalize better. In particular, it is important to grow model capacity over time if the initial model is relatively small. Moreover, updating the model at an intermediate rate strikes the best trade-off between accuracy and time to obtain a useful predictor.
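The greedy-versus-tardy trade-off can be illustrated with a toy streaming estimator; running-mean estimation here is a deliberately simple stand-in for model training:

```python
def stream_batches(batches):
    # greedy: refresh a running mean after every batch, so a (rough)
    # prediction is available immediately; tardy: wait for all the data
    greedy_estimates = []
    total, count = 0.0, 0
    for batch in batches:
        total += sum(batch)
        count += len(batch)
        greedy_estimates.append(total / count)   # usable right away
    tardy_estimate = total / count               # available only at the end
    return greedy_estimates, tardy_estimate

greedy, tardy = stream_batches([[1, 2], [3, 4], [5, 6]])
```

The greedy learner produces a usable estimate after the first batch, while the tardy learner matches its final quality but offers nothing until the whole stream has arrived; ALMA's metrics score learners across this whole timeline rather than only at the end.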
76 - Xinglin Pan , Jing Xu , Yu Pan 2021
Convolutional Neural Networks (CNNs) have achieved tremendous success in a number of learning tasks, including image classification. Recent advanced CNN models, such as ResNets, mainly focus on skip connections to avoid gradient vanishing. DenseNet designs suggest creating additional bypasses to transfer features as an alternative strategy in network design. In this paper, we design Attentive Feature Integration (AFI) modules, which are widely applicable to most recent network architectures, leading to new architectures named AFI-Nets. AFI-Nets explicitly model the correlations among different levels of features and selectively transfer features with little overhead. AFI-ResNet-152 obtains a 1.24% relative improvement on the ImageNet dataset while decreasing the FLOPs by about 10% and the number of parameters by about 9.2% compared to ResNet-152.
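A toy sketch of attention-weighted integration of features from different levels. Scalar mean-activation gating here is an illustrative stand-in for the learned attention in an AFI module, not the paper's architecture:

```python
import math

def afi_module(features):
    # features: list of equal-length feature vectors from different levels.
    # Compute one attention weight per level from its mean activation
    # (a hand-crafted stand-in for learned gating), then fuse levels
    # as the attention-weighted sum, so informative levels dominate.
    means = [sum(f) / len(f) for f in features]
    m = max(means)
    w = [math.exp(x - m) for x in means]
    z = sum(w)
    w = [x / z for x in w]
    return [sum(wi * f[i] for wi, f in zip(w, features))
            for i in range(len(features[0]))]

fused = afi_module([[1.0, 2.0], [1.0, 2.0]])
```

With identical levels the weights are uniform and the fusion is a plain average; a level with much stronger activations would receive nearly all of the weight.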
83 - Jing Xu , Yu Tian , Shuai Yuan 2021
Spectrum anomaly detection is of great importance in wireless communication for ensuring safety and improving spectrum efficiency. However, it faces many difficulties, especially in unauthorized frequency bands. For example, the composition of unauthorized frequency bands is very complex and the abnormal usage patterns are unknown a priori. In this paper, a noise attention method is proposed for unsupervised spectrum anomaly detection in unauthorized bands. First, we theoretically prove that anomalies in unauthorized bands raise the noise floor of the spectrogram after VAE reconstruction. Then, we introduce a novel anomaly metric, named the noise attention score, to capture spectrum anomalies more effectively. The effectiveness of the proposed method is experimentally verified in the 2.4 GHz ISM band. Leveraging the noise attention score, the AUC of anomaly detection is increased by 0.193. The proposed method enables reliable detection of abnormal spectrum usage while keeping a low false alarm rate.
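A toy sketch of a noise-floor-based anomaly score in the spirit described above. The median floor estimate, the reference-floor subtraction, and the synthetic spectra are all illustrative assumptions, not the paper's exact metric:

```python
def noise_floor(spectrum, quantile=0.5):
    # estimate the noise floor as the median spectrogram bin: sparse
    # strong signals barely move the median, while a uniformly raised
    # floor (as after VAE reconstruction of an anomalous band) shifts it up
    s = sorted(spectrum)
    return s[int(len(s) * quantile)]

def noise_attention_score(reconstructed, reference_floor):
    # hypothetical anomaly metric: excess of the reconstructed noise
    # floor over a reference floor estimated from normal captures
    return noise_floor(reconstructed) - reference_floor

# synthetic spectra: 90 noise bins plus 10 legitimate signal peaks
normal = [1.0] * 90 + [10.0] * 10
anomalous = [1.5] * 90 + [10.0] * 10   # floor raised by a hidden anomaly
score_normal = noise_attention_score(normal, noise_floor(normal))
score_anomalous = noise_attention_score(anomalous, noise_floor(normal))
```

The score stays at zero for the normal capture and rises for the anomalous one even though the visible signal peaks are unchanged, which is the property the noise attention score exploits.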
Backdoor attacks represent a serious threat to neural network models. A backdoored model will misclassify trigger-embedded inputs into an attacker-chosen target label while performing normally on other benign inputs. There are already numerous works on backdoor attacks on neural networks, but only a few consider graph neural networks (GNNs). As such, there is no intensive research explaining the impact of the trigger injecting position on the performance of backdoor attacks on GNNs. To bridge this gap, we conduct an experimental investigation of the performance of backdoor attacks on GNNs. We apply two powerful GNN explainability approaches to select the optimal trigger injecting position to achieve two attacker objectives -- a high attack success rate and a low clean accuracy drop. Our empirical results on benchmark datasets and state-of-the-art neural network models demonstrate the proposed method's effectiveness in selecting the trigger injecting position for backdoor attacks on GNNs. For instance, on the node classification task, the backdoor attack with the trigger injecting position selected by GraphLIME reaches an over $84\%$ attack success rate with less than a $2.5\%$ accuracy drop.
A fundamental prediction of quantum mechanics is that there are random fluctuations everywhere in a vacuum because of the zero-point energy. Remarkably, quantum electromagnetic fluctuations can induce a measurable force between neutral objects, known as the Casimir effect, which has attracted broad interest. The Casimir effect can dominate the interaction between microstructures at small separations and has been utilized to realize nonlinear oscillation, quantum trapping, phonon transfer, and dissipation dilution. However, a non-reciprocal device based on quantum vacuum fluctuations remains an unexplored frontier. Here we report quantum vacuum mediated non-reciprocal energy transfer between two micromechanical oscillators. We modulate the Casimir interaction parametrically to realize strong coupling between two oscillators with different resonant frequencies. We engineer the system's spectrum to have an exceptional point in the parameter space and observe the asymmetric topological structure near it. By dynamically changing the parameters near the exceptional point and utilizing the non-adiabaticity of the process, we achieve non-reciprocal energy transfer with high contrast. Our work represents an important development in utilizing quantum vacuum fluctuations to regulate energy transfer at the nanoscale and build functional Casimir devices.