Research papers, master's theses, and doctoral dissertations published by Yi Guo

Policy Optimization Using Semiparametric Models for Dynamic Pricing

Jianqing Fan, Yongyi Guo, Mengxin Yu (2021)

In this paper, we study the contextual dynamic pricing problem where the market value of a product is linear in its observed features plus some market noise. Products are sold one at a time, and only a binary response indicating success or failure of a sale is observed. Our model setting is similar to Javanmard and Nazerzadeh [2019] except that we expand the demand curve to a semiparametric model and need to learn dynamically both parametric and nonparametric components. We propose a dynamic statistical learning and decision-making policy that combines semiparametric estimation from a generalized linear model with an unknown link and online decision-making to minimize regret (maximize revenue). Under mild conditions, we show that for a market noise c.d.f. $F(\cdot)$ with $m$-th order derivative ($m \geq 2$), our policy achieves a regret upper bound of $\tilde{O}_{d}(T^{\frac{2m+1}{4m-1}})$, where $T$ is the time horizon and $\tilde{O}_{d}$ is the order that hides logarithmic terms and the dimensionality of the feature $d$. The upper bound is further reduced to $\tilde{O}_{d}(\sqrt{T})$ if $F$ is super smooth, with a Fourier transform that decays exponentially. In terms of dependence on the horizon $T$, these upper bounds are close to $\Omega(\sqrt{T})$, the lower bound where $F$ belongs to a parametric class. We further generalize these results to the case with dynamically dependent product features under the strong mixing condition.
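
To make the stated rate concrete, the following minimal sketch evaluates the regret exponent $\frac{2m+1}{4m-1}$ for a few smoothness orders $m$ and shows it decreasing toward $\frac{1}{2}$, the exponent of the $\sqrt{T}$ bound. The function name `regret_exponent` is hypothetical; only the formula comes from the abstract.

```python
from fractions import Fraction


def regret_exponent(m: int) -> Fraction:
    """Exponent of T in the upper bound T^((2m+1)/(4m-1)) quoted in the abstract."""
    if m < 2:
        raise ValueError("the abstract states the bound for m >= 2")
    return Fraction(2 * m + 1, 4 * m - 1)


if __name__ == "__main__":
    for m in (2, 3, 5, 10, 100):
        exponent = regret_exponent(m)
        # The gap to 1/2 equals 3 / (2 * (4m - 1)), so it shrinks as m grows.
        print(m, exponent, float(exponent))
```

The exponent is $5/7$ at $m = 2$ and stays strictly above $1/2$ for every finite $m$, which matches the abstract's remark that the bounds are close to, but not below, the parametric lower bound.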

Machine Learning, Econometrics, Optimization and Control

GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization

Yi Guo, Huan Yuan, Jianchao Tan (2021)

Model compression techniques have recently gained explosive attention as a way to obtain efficient AI models for various real-time applications. Channel pruning is one important compression strategy and is widely used in slimming various DNNs. Previous gate-based or importance-based pruning methods aim to remove channels whose importance is smallest. However, it remains unclear what criteria the channel importance should be measured on, leading to various channel selection heuristics. Some other sampling-based pruning methods deploy sampling strategies to train sub-nets, which often causes training instability and degrades the performance of the compressed model. In view of these research gaps, we present a new module named Gates with Differentiable Polarization (GDP), inspired by principled optimization ideas. GDP can be plugged in before convolutional layers without bells and whistles, to control the on-and-off state of each channel or whole layer block. During the training process, the polarization effect will drive a subset of gates to smoothly decrease to exact zero, while other gates gradually stay away from zero by a large margin. When training terminates, those zero-gated channels can be painlessly removed, while other non-zero gates can be absorbed into the succeeding convolution kernel, causing no interruption to training and no damage to the trained model. Experiments conducted over CIFAR-10 and ImageNet datasets show that the proposed GDP algorithm achieves state-of-the-art performance on various benchmark DNNs at a broad range of pruning ratios. We also apply GDP to DeepLabV3Plus-ResNet50 on the challenging Pascal VOC segmentation task, whose test performance sees no drop (even slightly improved) with over 60% FLOPs saving.
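
The claim that zero gates can be removed and non-zero gates absorbed into the next kernel can be illustrated with a minimal sketch. It assumes a 1x1 convolution at a single spatial position, which reduces to a matrix-vector product; the helper names `conv1x1` and `absorb_gates` are hypothetical and this is not the authors' implementation.

```python
def conv1x1(weights, x):
    """1x1 convolution at one spatial position: each output is a weighted sum of input channels."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def absorb_gates(weights, gates):
    """Drop input channels whose gate is exactly zero and fold the remaining gates into the kernel."""
    keep = [c for c, g in enumerate(gates) if g != 0.0]
    pruned = [[row[c] * gates[c] for c in keep] for row in weights]
    return pruned, keep


if __name__ == "__main__":
    gates = [0.0, 1.5, 0.0, 0.5]             # two channels polarized to exact zero
    x = [1.0, 2.0, 3.0, 4.0]                 # input activations, one value per channel
    weights = [[1.0, 2.0, 3.0, 4.0],
               [0.5, -1.0, 2.0, -2.0]]       # kernel with 2 output and 4 input channels

    gated_output = conv1x1(weights, [g * v for g, v in zip(gates, x)])
    pruned_weights, keep = absorb_gates(weights, gates)
    pruned_output = conv1x1(pruned_weights, [x[c] for c in keep])
    print(gated_output, pruned_output)
```

Because the gate multiplies each input channel before a linear operation, scaling the corresponding kernel column by the gate gives the same output, so the smaller kernel reproduces the gated network at this layer.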

Computer Vision and Pattern Recognition

Pagurus: Eliminating Cold Startup in Serverless Computing with Inter-Action Container Sharing

Zijun Li, Quan Chen, Minyi Guo (2021)

Serverless computing provides fine-grained resource sharing between Cloud tenants through containers. Each function invocation (action) runs in an individual container. When there is no already started container for a user function, a new container has to be created for it. However, the long cold startup time of a container results in long response latency for the action. Our investigation shows that the containers for some user actions share most of the software packages. If an action that requires a new container can "borrow" a similar warm container from other actions, the long cold startup can be eliminated. Based on the above finding, we propose Pagurus, a runtime container management system for eliminating the cold startup in serverless computing. Pagurus comprises an inter-action container scheduler and an intra-action container scheduler for each action. The inter-action container scheduler schedules shared containers among actions. The intra-action container scheduler deals with the management of the container lifecycle. Our experimental results show that Pagurus effectively eliminates the time-consuming container cold startup. An action may start to run in 10ms with Pagurus, even if there is no warm container for it.

Distributed, Parallel, and Cluster Computing

Clinical Relation Extraction Using Transformer-based Models

Xi Yang, Zehao Yu, Yi Guo (2021)

The newly emerged transformer technology has a tremendous impact on NLP research. In the general English domain, transformer-based models have achieved state-of-the-art performances on various NLP benchmarks. In the clinical domain, researchers also have investigated transformer models for clinical applications. The goal of this study is to systematically explore three widely used transformer-based models (i.e., BERT, RoBERTa, and XLNet) for clinical relation extraction and develop an open-source package with clinical pre-trained transformer-based models to facilitate information extraction in the clinical domain. We developed a series of clinical RE models based on three transformer architectures, namely BERT, RoBERTa, and XLNet. We evaluated these models using 2 publicly available datasets from 2018 MADE1.0 and 2018 n2c2 challenges. We compared two classification strategies (binary vs. multi-class classification) and investigated two approaches to generate candidate relations in different experimental settings. In this study, we compared three transformer-based (BERT, RoBERTa, and XLNet) models for relation extraction. We demonstrated that the RoBERTa-clinical RE model achieved the best performance on the 2018 MADE1.0 dataset with an F1-score of 0.8958. On the 2018 n2c2 dataset, the XLNet-clinical model achieved the best F1-score of 0.9610. Our results indicated that the binary classification strategy consistently outperformed the multi-class classification strategy for clinical relation extraction. Our methods and models are publicly available at https://github.com/uf-hobi-informatics-lab/ClinicalTransformerRelationExtraction. We believe this work will improve current practice on clinical relation extraction and other related NLP tasks in the biomedical domain.

Computation and Language, Information Retrieval, Machine Learning

Thermal variational quantum simulation on a superconducting quantum processor

Xue-Yi Guo, Shang-Shu Li, Xiao Xiao (2021)

Solving for finite-temperature properties of quantum many-body systems is generally challenging for classical computers due to the high computational complexity. In this article, we present experiments to demonstrate a hybrid quantum-classical simulation of thermal quantum states. By combining a classical probabilistic model and a 5-qubit programmable superconducting quantum processor, we prepare Gibbs states and excited states of Heisenberg XY and XXZ models with high fidelity and compute thermal properties including the variational free energy, energy, and entropy with a small statistical error. Our approach combines the advantages of classical probabilistic models for sampling and quantum co-processors for unitary transformations. We show that the approach is scalable in the number of qubits and has a self-verifiable feature, revealing its potential for solving large-scale quantum statistical mechanics problems on near-term intermediate-scale quantum computers.
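
As a point of reference for the thermal quantities named above, the sketch below computes the exact free energy, energy, and entropy of a Gibbs state from a known energy spectrum. It is a purely classical calculation, not the variational method of the paper. It assumes units with $k_B = 1$ and a two-qubit XY Hamiltonian $H = J(XX + YY)$, whose eigenvalues are $-2J, 0, 0, 2J$; the function name `thermal_properties` is hypothetical.

```python
import math


def thermal_properties(energies, temperature):
    """Return (free energy, energy, entropy) of the Gibbs state for a given spectrum, with k_B = 1."""
    beta = 1.0 / temperature
    e_min = min(energies)
    # Shift by the lowest level before exponentiating to avoid overflow at low temperature.
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z_shifted = sum(weights)
    probs = [w / z_shifted for w in weights]
    energy = sum(p * e for p, e in zip(probs, energies))
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    free_energy = e_min - temperature * math.log(z_shifted)   # equals -T ln Z
    return free_energy, energy, entropy


if __name__ == "__main__":
    coupling = 1.0
    spectrum = [-2.0 * coupling, 0.0, 0.0, 2.0 * coupling]    # two-qubit XY model
    for temperature in (0.5, 1.0, 5.0):
        f, e, s = thermal_properties(spectrum, temperature)
        # For the exact Gibbs state the identity F = E - T S holds.
        print(temperature, f, e, s, e - temperature * s)
```

In a variational setting, the free energy of a trial state is bounded below by this exact value, which is why the identity $F = E - TS$ evaluated on the Gibbs state is a useful check.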

Quantum Physics, Quantum Gases, Statistical Mechanics

Machine Learning for Variance Reduction in Online Experiments

Yongyi Guo, Dominic Coey, Mikael Konutgan (2021)

We consider the problem of variance reduction in randomized controlled trials, through the use of covariates correlated with the outcome but independent of the treatment. We propose a machine learning regression-adjusted treatment effect estimator, which we call MLRATE. MLRATE uses machine learning predictors of the outcome to reduce estimator variance. It employs cross-fitting to avoid overfitting biases, and we prove consistency and asymptotic normality under general conditions. MLRATE is robust to poor predictions from the machine learning step: if the predictions are uncorrelated with the outcomes, the estimator performs asymptotically no worse than the standard difference-in-means estimator, while if predictions are highly correlated with outcomes, the efficiency gains are large. In A/A tests, for a set of 48 outcome metrics commonly monitored in Facebook experiments, the estimator has over 70% lower variance than the simple difference-in-means estimator, and about 19% lower variance than the common univariate procedure which adjusts only for pre-experiment values of the outcome.
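
The idea of cross-fitted regression adjustment can be sketched on simulated data. Everything below is an illustrative assumption rather than the authors' code: a simple linear fit stands in for the machine learning predictor, the data-generating process is invented, and the final regression of the outcome on treatment, prediction, and their centered interaction is one common form of regression adjustment that the abstract does not spell out.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients obtained from the normal equations."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(a, b)


def simulate(n, effect, seed):
    """Invented experiment: one covariate, random assignment, additive treatment effect."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    t = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
    y = [2.0 * xi + effect * ti + rng.gauss(0.0, 1.0) for xi, ti in zip(x, t)]
    return x, t, y


def difference_in_means(t, y):
    treated = [yi for ti, yi in zip(t, y) if ti == 1.0]
    control = [yi for ti, yi in zip(t, y) if ti == 0.0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def cross_fitted_predictions(x, y, folds=2):
    """Predict each unit's outcome with a model fitted only on the other folds."""
    n = len(y)
    g = [0.0] * n
    for f in range(folds):
        train = [i for i in range(n) if i % folds != f]
        b0, b1 = ols([[1.0, x[i]] for i in train], [y[i] for i in train])
        for i in range(n):
            if i % folds == f:
                g[i] = b0 + b1 * x[i]
    return g


def regression_adjusted_effect(x, t, y):
    """Coefficient on treatment when regressing y on [1, t, g, t * (g - mean(g))]."""
    g = cross_fitted_predictions(x, y)
    g_bar = sum(g) / len(g)
    rows = [[1.0, ti, gi, ti * (gi - g_bar)] for ti, gi in zip(t, g)]
    return ols(rows, y)[1]


if __name__ == "__main__":
    x, t, y = simulate(n=4000, effect=1.0, seed=0)
    print("difference in means:", difference_in_means(t, y))
    print("regression adjusted:", regression_adjusted_effect(x, t, y))
```

Both estimators target the same treatment effect. In this simulated setup the covariate explains most of the outcome variance, so the adjusted estimate would be expected to vary less across seeds than the difference in means.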

Machine Learning

A Discussion On the Validity of Manifold Learning

Dai Shi, Andi Han, Yi Guo (2021)

Dimensionality reduction (DR) and manifold learning (ManL) have been applied extensively in many machine learning tasks, including signal processing, speech recognition, and neuroinformatics. However, it remains unclear whether DR and ManL models can generate valid learning results. In this work, we investigate the validity of learning results of some widely used DR and ManL methods through the chart mapping function of a manifold. We identify a fundamental problem of these methods: the mapping functions induced by these methods violate the basic settings of manifolds, and hence they are not learning a manifold in the mathematical sense. To address this problem, we provide a provably correct algorithm called fixed points Laplacian mapping (FPLM), which has the geometric guarantee to find a valid manifold representation (up to a homeomorphism). Combining one additional condition (orientation preserving), we discuss a sufficient condition for an algorithm to be bijective for any d-simplex decomposition result on a d-manifold. However, constructing such a mapping function and its computational method satisfying these conditions is still an open problem in mathematics.

Machine Learning

We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

Liyi Guo, Junqi Jin, Haoqi Zhang (2021)

Advertising expenditures have become the major source of revenue for e-commerce platforms. Providing good advertising experiences for advertisers by reducing their costs of trial and error in discovering the optimal advertising strategies is crucial for the long-term prosperity of online advertising. To achieve this goal, the advertising platform needs to identify the advertisers' optimization objectives, and then recommend the corresponding strategies to fulfill the objectives. In this work, we first deploy a prototype strategy recommender system on the Taobao display advertising platform, which indeed increases the advertisers' performance and the platform's revenue, indicating the effectiveness of strategy recommendation for online advertising. We further augment this prototype system by explicitly learning the advertisers' preferences over various advertising performance indicators, and then their optimization objectives, through their adoption of different recommended advertising strategies. We use contextual bandit algorithms to efficiently learn the advertisers' preferences and maximize the recommendation adoption simultaneously. Simulation experiments based on Taobao online bidding data show that the designed algorithms can effectively optimize the strategy adoption rate of advertisers.

Information Retrieval, Machine Learning

XCloud-pFISTA: A Medical Intelligence Cloud for Accelerated MRI

Yirong Zhou, Chen Qian, Yi Guo (2021)

Machine learning and artificial intelligence have shown remarkable performance in accelerated magnetic resonance imaging (MRI). Cloud computing technologies have great advantages in building an easily accessible platform to deploy advanced algorithms. In this work, we develop an open-access, easy-to-use, and high-performance medical intelligence cloud computing platform (XCloud-pFISTA) to reconstruct MRI images from undersampled k-space data. Two state-of-the-art approaches of the Projected Fast Iterative Soft-Thresholding Algorithm (pFISTA) family have been successfully implemented on the cloud. This work can be considered a good example of cloud-based medical image reconstruction and may benefit the future development of integrated reconstruction and online diagnosis systems.
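
The soft-thresholding operator that gives this algorithm family its name can be written in a few lines. The sketch below shows only that elementwise operator on real-valued coefficients; it is not the pFISTA reconstruction itself, and the function name `soft_threshold` is hypothetical.

```python
def soft_threshold(values, threshold):
    """Shrink each value toward zero by `threshold`; values within the threshold become zero."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    result = []
    for v in values:
        if v > threshold:
            result.append(v - threshold)
        elif v < -threshold:
            result.append(v + threshold)
        else:
            result.append(0.0)
    return result


if __name__ == "__main__":
    coefficients = [3.0, -0.5, -2.0, 0.25]
    print(soft_threshold(coefficients, 1.0))
```

Small coefficients are set to zero and large ones are reduced in magnitude, which is how iterative soft-thresholding methods promote sparse solutions.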

Image and Video Processing

XCloud-VIP: Virtual Peak Enables Highly Accelerated NMR Spectroscopy and Faithful Quantitative Measures

Di Guo, Zhangren Tu, Yi Guo (2021)

Background: Nuclear Magnetic Resonance (NMR) spectroscopy is an important bio-engineering tool for determining metabolic concentrations, molecular structures, and so on. The data acquisition time, however, is very long in multi-dimensional NMR. Non-uniform sampling is an effective way to accelerate data acquisition, but it may encounter severe spectral distortions and unfaithful quantitative measures when the acceleration factor is high. Objective: To reconstruct high-fidelity spectra from highly accelerated NMR and achieve much better quantitative measures. Methods: A virtual peak (VIP) approach is proposed to self-learn the prior spectral information, such as the central frequency and peak lineshape, and then feed this information into the reconstruction. The proposed method is further implemented with cloud computing to facilitate online, open, and easy access. Results: Results on synthetic and experimental data demonstrate that, compared with the state-of-the-art method, the new approach provides much better reconstruction of low-intensity peaks and significantly improves the quantitative measures, including the regression of peak intensity, the distances between nuclear pairs, and concentrations of metabolites in mixtures. Conclusion: Self-learning prior peak information can improve the reconstruction and quantitative measures of spectra. Significance: This approach enables highly accelerated NMR and may promote time-consuming applications such as quantitative and time-resolved NMR experiments.

Medical Physics
