Research papers and master's and doctoral theses published by Bo Wang

PowerGym: A Reinforcement Learning Environment for Volt-Var Control in Power Distribution Systems

79 - Ting-Han Fan, Xian Yeow Lee, Yubo Wang 2021

We introduce PowerGym, an open-source reinforcement learning environment for Volt-Var control in power distribution systems. Following OpenAI Gym APIs, PowerGym targets minimizing power loss and voltage violations under physical networked constraints. PowerGym provides four distribution systems (13Bus, 34Bus, 123Bus, and 8500Node) based on IEEE benchmark systems and design variants for various control difficulties. To foster generalization, PowerGym offers a detailed customization guide for users working with their distribution systems. As a demonstration, we examine state-of-the-art reinforcement learning algorithms in PowerGym and validate the environment by studying controller behaviors.

Machine Learning, Artificial Intelligence

NumGPT: Improving Numeracy Ability of Generative Pre-trained Models

214 - Zhihua Jin, Xin Jiang, Xingbo Wang 2021

Existing generative pre-trained language models (e.g., GPT) focus on modeling the language structure and semantics of general texts. However, those models do not consider the numerical properties of numbers and cannot perform robustly on numerical reasoning tasks (e.g., math word problems and measurement estimation). In this paper, we propose NumGPT, a generative pre-trained model that explicitly models the numerical properties of numbers in texts. Specifically, it leverages a prototype-based numeral embedding to encode the mantissa of the number and an individual embedding to encode the exponent of the number. A numeral-aware loss function is designed to integrate numerals into the pre-training objective of NumGPT. We conduct extensive experiments on four different datasets to evaluate the numeracy ability of NumGPT. The experimental results show that NumGPT outperforms baseline models (e.g., GPT and GPT with DICE) on a range of numerical reasoning tasks such as measurement estimation, number comparison, math word problems, and magnitude classification. Ablation studies are also conducted to evaluate the impact of pre-training and model hyperparameters on performance.
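
As an illustration of the mantissa/exponent split that the abstract mentions, the following minimal sketch decomposes a finite number into base-10 sign, mantissa, and exponent. The function name `decompose`, the `digits` parameter, and the choice of base 10 are assumptions made for this example; the sketch does not reproduce NumGPT's prototype-based embedding or its loss function.

```python
def decompose(value: float, digits: int = 6) -> tuple[int, float, int]:
    """Split a finite number into (sign, mantissa, exponent) in base 10.

    For non-zero input, value is approximately sign * mantissa * 10**exponent
    with 1 <= mantissa < 10. Zero maps to (1, 0.0, 0).
    Hypothetical helper for illustration only, not NumGPT's actual code.
    """
    sign = -1 if value < 0 else 1
    # Scientific-notation formatting normalizes the mantissa for us,
    # e.g. 1234.5 -> "1.234500e+03".
    mantissa_text, exponent_text = f"{abs(value):.{digits}e}".split("e")
    return sign, float(mantissa_text), int(exponent_text)


print(decompose(1234.5))  # (1, 1.2345, 3)
print(decompose(0.05))    # (1, 5.0, -2)
```

A model along the lines described above could then embed the mantissa and the exponent separately, so that numbers of similar magnitude share an exponent representation.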

Computation and Language, Machine Learning

Eliminating Sentiment Bias for Aspect-Level Sentiment Classification with Unsupervised Opinion Extraction

103 - Bo Wang, Tao Shen, Guodong Long 2021

Aspect-level sentiment classification (ALSC) aims at identifying the sentiment polarity of a specified aspect in a sentence. ALSC is a practical setting in aspect-based sentiment analysis because no opinion term labeling is needed, but it fails to interpret why a sentiment polarity is derived for the aspect. To address this problem, recent works fine-tune pre-trained Transformer encoders for ALSC to extract an aspect-centric dependency tree that can locate the opinion words. However, the induced opinion words only provide an intuitive cue far below human-level interpretability. Besides, the pre-trained encoder tends to internalize an aspect's intrinsic sentiment, causing sentiment bias and thus affecting model performance. In this paper, we propose a span-based anti-bias aspect representation learning framework. It first eliminates the sentiment bias in the aspect embedding by adversarial learning against aspects' prior sentiment. Then, it aligns the distilled opinion candidates with the aspect by span-based dependency modeling to highlight the interpretable opinion terms. Our method achieves new state-of-the-art performance on five benchmarks, with the capability of unsupervised opinion extraction.

Computation and Language

Assessing the Knowledge State of Online Students -- New Data, New Approaches, Improved Accuracy

70 - Robin Schmucker, Jingbo Wang, Shijia Hu 2021

We consider the problem of assessing the changing knowledge state of individual students as they go through online courses. This student performance (SP) modeling problem, also known as knowledge tracing, is a critical step for building adaptive online teaching systems. Specifically, we conduct a study of how to utilize various types and large amounts of students' log data to train accurate machine learning models that predict the knowledge state of future students. This study is the first to use four very large datasets made available recently from four distinct intelligent tutoring systems. Our results include a new machine learning approach that defines a new state of the art for SP modeling, improving over earlier methods in several ways: First, we achieve improved accuracy by introducing new features that can be easily computed from conventional question-response logs (e.g., the pattern in the student's most recent answers). Second, we take advantage of features of the student history that go beyond question-response pairs (e.g., which video segments the student watched, or skipped) as well as information about prerequisite structure in the curriculum. Third, we train multiple specialized models for different aspects of the curriculum (e.g., specializing in early versus later segments of the student history), then combine these specialized models to create a group prediction of student knowledge. Taken together, these innovations yield an average AUC score across these four datasets of 0.807 compared to the previous best logistic regression approach score of 0.766, and also outperform state-of-the-art deep neural net approaches. Importantly, we observe consistent improvements from each of our three methodological innovations, in each dataset, suggesting that our methods are of general utility and likely to produce improvements for other online tutoring systems as well.
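
The abstract reports its results as AUC scores. As a reminder of what that metric measures, here is a minimal sketch that computes AUC directly from its pairwise definition: the probability that a randomly chosen correct response receives a higher predicted score than a randomly chosen incorrect one, with ties counted as one half. The function name `auc` and the toy data are assumptions for illustration and are not taken from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 0/1 outcomes (1 = student answered correctly).
    scores: predicted probabilities of a correct answer, same length.
    Quadratic in the number of examples, so suitable only for small inputs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under this reading, a score of 0.807 versus 0.766 means the model orders a correct/incorrect response pair properly about 81% of the time instead of about 77%.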

Machine Learning, Computers and Society

A Longitudinal Multi-modal Dataset for Dementia Monitoring and Diagnosis

100 - Dimitris Gkoumas, Bo Wang, Adam Tsakalidis 2021

Dementia is a family of neurodegenerative conditions affecting memory and cognition in an increasing number of individuals in our globally aging population. Automated analysis of language, speech and paralinguistic indicators has been gaining popularity as a potential indicator of cognitive decline. Here we propose a novel longitudinal multi-modal dataset collected from people with mild dementia and age-matched controls over a period of several months in a natural setting. The multi-modal data consists of spoken conversations, a subset of which are transcribed, as well as typed and written thoughts and associated extra-linguistic information such as pen strokes and keystrokes. We describe the dataset in detail and proceed to focus on a task using the speech modality. The latter involves distinguishing controls from people with dementia by exploiting the longitudinal nature of the data. Our experiments showed significant differences in how the speech varied from session to session in the control and dementia groups.

Computation and Language, Artificial Intelligence, Multimedia

Revisit the isospin violating decays of $X(3872)$

101 - Lu Meng, Guang-Juan Wang, Bo Wang 2021

In this work, we revisit the isospin violating decays of $X(3872)$ in a coupled-channel effective field theory. In the molecular scheme, the $X(3872)$ is interpreted as the bound state of the $\bar{D}^{*0}D^0/\bar{D}^0D^{*0}$ and $D^{*-}D^+/D^-D^{*+}$ channels. In a cutoff-independent formalism, we relate the coupling constants of $X(3872)$ with the two channels to the molecular wave function. The isospin violating decays of $X(3872)$ are obtained by two equivalent approaches, which amend some deficiencies on this issue in the literature. In the quantum field theory approach, the isospin violating decays arise from the coupling constants of $X(3872)$ to two di-meson channels. In the quantum mechanics approach, the isospin violation is attributed to wave functions at the origin. We illustrate how to cure the insufficient results in the literature. Within the comprehensive analysis, we bridge the isospin violating decays of $X(3872)$ to its inner structure. Our results show that the proportion of the neutral channel in $X(3872)$ is over $80\%$. As a by-product, we calculate the strong decay width of $X(3872)\to \bar{D}^0 D^0\pi^0$ and the radiative one of $X(3872)\to \bar{D}^0 D^0\gamma$. The strong decay width and radiative decay width are about 30 keV and 10 keV, respectively, for the binding energy from $-300$ keV to $-50$ keV.

High Energy Physics - Phenomenology, High Energy Physics - Experiment, High Energy Physics - Lattice

Near-field light-bending photonic switch: physics of switching based on three-dimensional Poynting vector analysis

127 - Liyang Yue, Zengbo Wang, Bing Yan 2021

A photonic hook is a high-intensity bent light focus with a curvature proportional to the wavelength of the incident light. Based on this unique light-bending phenomenon, a novel near-field photonic switch, realised by a right-trapezoid dielectric Janus particle-lens embedded in the core of a planar waveguide, is proposed and studied by numerical simulation for switching photonic signals at two common optical communication wavelengths, 1310 nm and 1550 nm. The signals at these two wavelengths can be guided to different routes according to their oppositely bent photonic hooks to realise wavelength selective switching. The switching mechanism is analysed by an in-house developed three-dimensional (3D) Poynting vector visualisation technology. It demonstrates that the 3D distribution and number of Poynting vector vortexes produced by the particle highly affect the shapes and bending directions of the photonic hooks causing the near-field switching, and multiple independent high-magnitude areas matched by the regional Poynting vector streamlines can form these photonic hooks. The corresponding mechanism can only be represented by 3D Poynting vector distributions and is being reported for the first time.

Optics

Physiological-Physical Feature Fusion for Automatic Voice Spoofing Detection

110 - Junxiao Xue, Hao Zhou, Yabo Wang 2021

Speaker verification systems have been used in many production scenarios in recent years. Unfortunately, they are still highly prone to different kinds of spoofing attacks such as voice conversion and speech synthesis. In this paper, we propose a new method based on physiological-physical feature fusion to deal with voice spoofing attacks. This method involves feature extraction, a densely connected convolutional neural network with squeeze and excitation block (SE-DenseNet), a multi-scale residual neural network with squeeze and excitation block (SE-Res2Net) and feature fusion strategies. We first pre-train a convolutional neural network using the speaker's voice and face in the video as surveillance signals. It can extract physiological features from speech. Then we use SE-DenseNet and SE-Res2Net to extract physical features. Such a dense connection pattern has high parameter efficiency, and the squeeze and excitation block can enhance the transmission of the feature. Finally, we integrate the two features into the SE-DenseNet to identify the spoofing attacks. Experimental results on the ASVspoof 2019 data set show that our model is effective for voice spoofing detection. In the logical access scenario, our model improves the tandem decision cost function (t-DCF) and equal error rate (EER) scores by 4% and 7%, respectively, compared with other methods. In the physical access scenario, our model improves the t-DCF and EER scores by 8% and 10%, respectively.

Audio and Speech Processing, Sound, Image and Video Processing

Reversed Strichartz estimates for wave on non-trapping asymptotically hyperbolic manifolds and applications

194 - Yannick Sire, Christopher D. Sogge, Chengbo Wang 2021

We provide reversed Strichartz estimates for the shifted wave equations on non-trapping asymptotically hyperbolic manifolds using cluster estimates for spectral projectors proved previously in such generality. As a consequence, we solve a problem left open in \cite{SSWZ} about the endpoint case for global well-posedness of nonlinear wave equations. We also provide estimates in this context for the maximal wave operator.

Analysis of PDEs

Decentralized Power Allocation and Beamforming Using Non-Convex Nash Game for Energy-Aware mmWave Networks

120 - Wenbo Wang, Amir Leshem 2021

This paper focuses on the problem of joint beamforming control and power allocation in the ad-hoc mmWave network. Over the shared spectrum, a number of multi-input-multi-output links attempt to minimize their supply power by simultaneously finding the locally optimal power allocation and beamformers in a self-interested manner. Our design considers a category of non-convex quality-of-service constraints, which are a function of the coupled strategies adopted by the mutually interfering ad-hoc links. We propose a two-stage, decentralized searching scheme, where the adaptation of power levels and beamformer filters is performed in two separate sub-stages iteratively at each link. By introducing the analysis based on the generalized Nash equilibrium, we provide the theoretical proof of the convergence of our proposed power adaptation algorithm based on the local best response together with an iterative minimum mean square error receiver. Several transmit beamforming schemes requiring different levels of information exchange are compared. Our simulation results show that with a minimum-level requirement on the channel state information acquisition, a locally optimal transmit filter design based on the optimization of the local signal-to-interference-plus-noise ratio is able to achieve an acceptable tradeoff between link performance and the need for decentralization.

Networking and Internet Architecture
