Evolutionary neural architecture search (ENAS) has recently received increasing attention for its effectiveness in finding high-quality neural architectures; however, it incurs a high computational cost because the architecture encoded by each individual must be trained for complete epochs during individual evaluation. Numerous ENAS approaches have been developed to reduce this evaluation cost, but most of them struggle to achieve high evaluation accuracy. To address this issue, this paper proposes an accelerated ENAS via multi-fidelity evaluation, termed MFENAS, in which the individual evaluation cost is significantly reduced by training the architecture encoded by each individual for only a small number of epochs. The balance between evaluation cost and evaluation accuracy is maintained by the multi-fidelity evaluation, which identifies potentially good individuals that could not survive from previous generations by integrating multiple evaluations under different numbers of training epochs. To ensure high diversity of neural architectures, a population initialization strategy is devised to produce neural architectures ranging from ResNet-like to Inception-like ones. Experimental results on CIFAR-10 show that the architecture obtained by the proposed MFENAS achieves a 2.39% test error rate at a cost of only 0.6 GPU days on one NVIDIA 2080 Ti GPU, demonstrating the superiority of the proposed MFENAS over state-of-the-art NAS approaches in terms of both computational cost and architecture quality. The architecture obtained by the proposed MFENAS is then transferred to CIFAR-100 and ImageNet, where it also exhibits performance competitive with the architectures obtained by existing NAS approaches. The source code of the proposed MFENAS is available at https://github.com/DevilYangS/MFENAS/.
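As a rough illustration of the multi-fidelity idea (not the authors' implementation), the sketch below scores an architecture by training it incrementally to several small epoch budgets and aggregating the intermediate validation accuracies, so that individuals that only shine at higher fidelity are not discarded prematurely. The budgets, the weights, and the `train_and_eval` stub are all assumptions made for illustration.

```python
import random

def train_and_eval(arch, extra_epochs):
    # Stub standing in for real training: in MFENAS this would continue
    # training the network decoded from `arch` for `extra_epochs` more
    # epochs and return its validation accuracy.
    random.seed(hash((arch, extra_epochs)) & 0xFFFFFFFF)
    return random.uniform(0.5, 1.0)

def multifidelity_score(arch, budgets=(1, 4, 9), weights=(0.2, 0.3, 0.5)):
    """Aggregate validation accuracies measured at several epoch budgets."""
    scores, trained = [], 0
    for epochs in budgets:
        # Train incrementally up to each budget rather than from scratch.
        scores.append(train_and_eval(arch, extra_epochs=epochs - trained))
        trained = epochs
    # Weight higher-fidelity measurements more, while still crediting
    # individuals that already look promising after very few epochs.
    return sum(w * s for w, s in zip(weights, scores))
```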
Towards predicting patch correctness in APR, we propose a simple but novel hypothesis on how the link between patch behaviour and failing test specifications can be drawn: similar failing test cases should require similar patches. We then propose BATS, an unsupervised learning-based system that predicts patch correctness by checking patch Behaviour Against failing Test Specification. BATS exploits deep representation learning models for code and patches: for a given failing test case, the yielded embedding is used to compute similarity metrics in a search for similar historical test cases, whose associated patches then serve as a proxy for assessing the correctness of the generated patch. Experimentally, we first validate our hypothesis by assessing whether ground-truth developer patches cluster together in the same way that their associated failing test cases do. Then, after collecting a large dataset of 1278 plausible patches (written by developers or generated by 32 APR tools), we use BATS to predict correctness: BATS achieves an AUC between 0.557 and 0.718 and a recall between 0.562 and 0.854 in identifying correct patches. Compared against previous work, our approach outperforms the state of the art in patch correctness prediction without needing large labeled patch datasets, in contrast with prior machine learning-based approaches. While BATS is constrained by the availability of similar test cases, we show that it can still complement existing approaches: used in conjunction with a recent supervised learning approach, BATS improves the overall recall in detecting correct patches. We finally show that BATS can also complement PATCH-SIM, the state-of-the-art dynamic approach for identifying correct patches generated by APR tools.
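The retrieval step behind the hypothesis can be sketched as follows, assuming test cases and patches have already been embedded by a code representation model; the k-nearest-neighbour search, the mean-similarity score, and the threshold are illustrative assumptions, not the BATS implementation.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_patch_correctness(failing_test_vec, generated_patch_vec,
                              hist_test_vecs, hist_patch_vecs,
                              k=5, threshold=0.6):
    # Find the k historical failing tests most similar to the new one.
    sims = [cosine(failing_test_vec, t) for t in hist_test_vecs]
    top_k = np.argsort(sims)[-k:]
    # Compare the generated patch against the patches that fixed them:
    # "similar failing tests should require similar patches".
    patch_sims = [cosine(generated_patch_vec, hist_patch_vecs[i]) for i in top_k]
    score = float(np.mean(patch_sims))
    return score, score >= threshold
```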
Ye Tian, Yang Feng (2021)
In this work, we study the transfer learning problem under high-dimensional generalized linear models (GLMs), which aims to improve the fit on target data by borrowing information from useful source data. Given which sources to transfer, we propose an oracle algorithm and derive its $\ell_2$-estimation error bounds. The theoretical analysis shows that, under certain conditions, when the target and sources are sufficiently close to each other, the estimation error bound improves over that of the classical penalized estimator using only target data. When we do not know which sources to transfer, an algorithm-free transferable source detection approach is introduced to detect informative sources, and its detection consistency is proved in the high-dimensional GLM transfer learning setting. Extensive simulations and a real-data experiment verify the effectiveness of our algorithms.
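For the Gaussian special case (a linear model), the two-step oracle-style estimator can be sketched as below: a pooled $\ell_1$-penalized fit on source plus target data, followed by a sparse correction fitted on target residuals. The penalty levels and the use of sklearn's Lasso are simplifying assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

def transfer_lasso(X_tgt, y_tgt, X_src, y_src, lam_pool=0.01, lam_corr=0.05):
    # Step 1: rough coefficient estimate from pooled source + target data.
    pooled = Lasso(alpha=lam_pool).fit(
        np.vstack([X_src, X_tgt]), np.concatenate([y_src, y_tgt]))
    w = pooled.coef_
    # Step 2: sparse correction on target residuals, so the final estimate
    # deviates from the pooled one only where the target data demand it.
    corr = Lasso(alpha=lam_corr).fit(X_tgt, y_tgt - X_tgt @ w)
    return w + corr.coef_
```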
Ye Tian, Xingyi Zhang, Cheng He (2021)
In the past three decades, a large number of metaheuristics have been proposed and have shown high performance in solving complex optimization problems. While most variation operators in existing metaheuristics are designed empirically, this paper aims to design new operators automatically, with the expectation that they are independent of the search space and thus exhibit robust performance on different problems. For this purpose, this work first investigates the influence of translation invariance, scale invariance, and rotation invariance on the search behavior and performance of some representative operators. Then, we deduce the generic form of translation-, scale-, and rotation-invariant operators. Afterwards, a principled approach is proposed for the automated design of operators, which searches for high-performance operators based on the deduced generic form. The experimental results demonstrate that the operators generated by the proposed approach outperform state-of-the-art ones on a variety of problems with complex landscapes and up to 1000 decision variables.
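To make the invariance requirements concrete, the sketch below generates an offspring as one parent plus a linear combination of difference vectors between randomly paired parents: translation cancels in the differences, uniform scaling and rotation commute with the update, and no coordinate-wise operation breaks rotation invariance. The coefficients stand in for what an automated design process would search over; this illustrates one operator family of this kind, not the paper's deduced generic form.

```python
import numpy as np

def invariant_variation(pop, coeffs=(0.5, 0.3), rng=None):
    # `pop` is an (n, d) array of parent solutions.
    rng = rng or np.random.default_rng()
    n = len(pop)
    child = pop[rng.integers(n)].copy()          # random base parent
    for c in coeffs:
        i, j = rng.choice(n, size=2, replace=False)
        child = child + c * (pop[i] - pop[j])    # difference vectors only
    return child
```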
Min Li, Yu Li, Ye Tian (2021)
This paper presents AppealNet, a novel edge/cloud collaborative architecture that runs deep learning (DL) tasks more efficiently than state-of-the-art solutions. For a given input, AppealNet accurately predicts on-the-fly whether it can be successfully processed by the DL model deployed on the resource-constrained edge device and, if not, appeals to the more powerful DL model deployed in the cloud. This is achieved with a two-head neural network architecture that explicitly takes inference difficulty into consideration and optimizes the trade-off between accuracy and the computation/communication cost of the edge/cloud collaborative architecture. Experimental results on several image classification datasets show energy savings of up to more than 40% compared to existing techniques, without sacrificing accuracy.
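A minimal PyTorch sketch of the two-head idea (an illustration, not the AppealNet release) is given below: a shared edge backbone feeds a classification head and an "appeal" head that predicts whether the edge prediction should be trusted, and only predicted-hard inputs are forwarded to the cloud model. Layer sizes and the threshold are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class TwoHeadEdgeModel(nn.Module):
    def __init__(self, in_dim=512, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.cls_head = nn.Linear(256, num_classes)   # edge prediction
        self.appeal_head = nn.Linear(256, 1)          # "appeal to cloud?"

    def forward(self, x):
        h = self.backbone(x)
        return self.cls_head(h), torch.sigmoid(self.appeal_head(h))

@torch.no_grad()
def collaborative_infer(x, edge_model, cloud_model, threshold=0.5):
    logits, appeal = edge_model(x)
    hard = appeal.squeeze(-1) > threshold
    if hard.any():
        # Pay the communication/computation cost only for hard inputs.
        logits[hard] = cloud_model(x[hard])
    return logits
```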
Warfarin, a commonly prescribed drug to prevent blood clots, has a highly variable individual response. Determining a maintenance warfarin dose that achieves a therapeutic blood clotting time, as measured by the international normalized ratio (INR), is crucial in preventing complications. Machine learning algorithms are increasingly being used for warfarin dosing; usually, an initial dose is predicted from clinical and genotype factors, and this dose is revised after a few days based on previous doses and the current INR. Since a sequence of prior doses and INR measurements better captures the variability in individual warfarin response, we hypothesized that longitudinal dose-response data would improve maintenance dose predictions. To test this hypothesis, we analyzed a dataset from the COAG warfarin dosing study, which includes clinical data, warfarin doses and INR measurements over the study period, and the maintenance dose at which therapeutic INR was achieved. Various machine learning regression models for predicting the maintenance warfarin dose were trained with clinical factors, dosing history, and INR data as features. Overall, dose revision algorithms with a single dose and INR achieved performance comparable to the baseline dose revision algorithm. In contrast, dose revision algorithms with longitudinal dose and INR data provided maintenance dose predictions that were statistically significantly closer to the true maintenance dose. Focusing on the best performing model, gradient boosting (GB), the proportion of ideally estimated doses, defined as within $\pm$20% of the true dose, increased from the baseline (54.92%) to the GB model with single (63.11%) and longitudinal (75.41%) INR data. More accurate maintenance dose predictions with longitudinal dose-response data can potentially achieve therapeutic INR faster, reduce drug-related complications, and improve patient outcomes with warfarin therapy.
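The modelling setup can be sketched as follows (an illustration, not the study's code): clinical covariates are concatenated with a flattened window of recent dose/INR pairs and fed to a gradient boosting regressor, and predictions are judged by the within-$\pm$20% criterion quoted above. The window length and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def build_features(clinical, dose_hist, inr_hist, window=5):
    # Clinical factors plus the last `window` (dose, INR) pairs per patient;
    # dose_hist and inr_hist are (n_patients, n_timesteps) arrays.
    return np.hstack([clinical, dose_hist[:, -window:], inr_hist[:, -window:]])

def fit_dose_model(clinical, dose_hist, inr_hist, maintenance_dose):
    X = build_features(clinical, dose_hist, inr_hist)
    return GradientBoostingRegressor(n_estimators=300, max_depth=3).fit(
        X, maintenance_dose)

def ideal_dose_rate(y_true, y_pred):
    # Proportion of predictions within +/-20% of the true maintenance dose.
    return float(np.mean(np.abs(y_pred - y_true) <= 0.2 * y_true))
```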
A large body of the automated program repair literature develops approaches where patches are generated and then validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by the oracle, may actually be incorrect. While state-of-the-art work explores research directions that require dynamic information or rely on manually crafted heuristics, we study the benefit of learning code representations to derive deep features that may encode the properties of patch correctness. Our work mainly investigates different representation learning approaches for code changes to derive embeddings that are amenable to similarity computations. We report findings based on embeddings produced by pre-trained and re-trained neural networks. Experimental results demonstrate the potential of embeddings to empower learning algorithms in reasoning about patch correctness: a machine learning predictor combining BERT transformer-based embeddings with logistic regression yielded an AUC of about 0.8 in predicting patch correctness on a deduplicated dataset of 1000 labeled patches. Our study shows that learned representations can achieve reasonable performance compared with the state-of-the-art PATCH-SIM, which relies on dynamic information. These representations may further be complementary to features that were carefully (manually) engineered in the literature.
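The predictor can be sketched as below, assuming patch embeddings have already been produced by a BERT-style encoder applied to the code before and after the change (concatenating the two is one plausible pairing among several); this is an illustration of the setup, not the paper's pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

def patch_correctness_auc(emb_before, emb_after, labels):
    # Represent each patch by the embeddings of the code before and after
    # the change; a linear classifier then separates correct from incorrect.
    X = np.hstack([emb_before, emb_after])
    clf = LogisticRegression(max_iter=1000)
    probs = cross_val_predict(clf, X, labels, cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(labels, probs)
```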
This paper focuses on Semi-Supervised Object Detection (SSOD). Knowledge Distillation (KD) has been widely used for semi-supervised image classification, but adapting these methods to SSOD faces the following obstacles. (1) The teacher model serves a dual role as teacher and student, such that the teacher's predictions on unlabeled images may be very close to those of the student, which limits the student's upper bound. (2) The class imbalance issue in SSOD hinders efficient knowledge transfer from teacher to student. To address these problems, we propose a novel method, Temporal Self-Ensembling Teacher (TSE-T), for SSOD. Unlike previous KD-based methods, we devise a temporally evolving teacher model. First, our teacher model ensembles its temporal predictions for unlabeled images under stochastic perturbations. Second, our teacher model ensembles its temporal model weights with the student's weights via an exponential moving average (EMA), which allows the teacher to learn gradually from the student. These self-ensembling strategies increase data and model diversity, thus improving teacher predictions on unlabeled images. Finally, we formulate the consistency regularization term with a focal loss to handle the data imbalance problem, which utilizes the useful information in unlabeled images more efficiently than simple hard thresholding, which preserves only confident predictions. Evaluated on the widely used VOC and COCO benchmarks, our method achieves mAPs of 80.73% and 40.52% on the VOC2007 test set and the COCO2014 minval5k set, respectively, outperforming a strong fully-supervised detector by 2.37% and 1.49%. Furthermore, our method sets a new state of the art in SSOD on the VOC2007 test set, outperforming the baseline SSOD method by 1.44%. The source code of this work is publicly available at http://github.com/syangdong/tse-t.
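The two self-ensembling updates can be sketched as follows (an illustration, not the released code): the teacher's weights track an exponential moving average of the student's, and a focal-style modulating factor down-weights predictions on which teacher and student already agree. The momentum and gamma values are common defaults, not necessarily the paper's.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    # Teacher weights drift slowly toward the student's (temporal
    # ensembling of model weights).
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)

def focal_consistency(student_probs, teacher_probs, gamma=2.0):
    # Cross-entropy between teacher and student class distributions,
    # modulated so confident agreements contribute little to the loss.
    ce = -(teacher_probs * torch.log(student_probs.clamp_min(1e-8))).sum(-1)
    agreement = (teacher_probs * student_probs).sum(-1)
    return ((1.0 - agreement).pow(gamma) * ce).mean()
```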
Chen Sun, Ye Tian, Liang Gao (2019)
Calibration models have been developed for the determination of trace elements, silver for instance, in soil using laser-induced breakdown spectroscopy (LIBS). The major concern is the matrix effect. Although it affects the accuracy of LIBS measurements in a general way, the effect appears accentuated for soil because of the large variation of chemical and physical properties among different soils. The purpose is to reduce its influence in such a way that an accurate and soil-independent calibration model can be constructed. At the same time, the developed model should efficiently reduce the experimental fluctuations that affect measurement precision. A univariate model first reveals the obvious influence of the matrix effect and substantial experimental fluctuation. A multivariate model has then been developed. A key point is the introduction of a generalized spectrum, in which variables representing the soil type are explicitly included. Machine learning has been used to develop the model. After a necessary pretreatment, in which a feature selection process reduces the dimension of the raw spectrum according to the number of available spectra, the data are fed into a back-propagation neural network (BPNN) to train and validate the model. The resulting soil-independent calibration model yields an average relative error of calibration (REC) and an average relative error of prediction (REP) within the range of 5-6%.
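The pipeline can be sketched as below (an illustration, not the study's code): a univariate feature selector reduces the raw spectrum dimension, a one-hot soil-type indicator is appended to form the generalized spectrum, and a back-propagation neural network regresses the concentration. The selector choice, layer sizes, and k are assumptions.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.neural_network import MLPRegressor

def fit_libs_model(spectra, soil_type_onehot, concentration, k=200):
    # Keep the k spectral variables most correlated with the analyte
    # concentration, matching the dimension to the number of spectra.
    selector = SelectKBest(f_regression, k=k).fit(spectra, concentration)
    # Generalized spectrum: selected lines + explicit soil-type variables.
    X = np.hstack([selector.transform(spectra), soil_type_onehot])
    bpnn = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000)
    return selector, bpnn.fit(X, concentration)
```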
Over the last three decades, a large number of evolutionary algorithms have been developed for solving multiobjective optimization problems. However, there is no up-to-date and comprehensive software platform for researchers to properly benchmark existing algorithms and for practitioners to apply selected algorithms to solve their real-world problems. The demand for such a common tool becomes even more urgent when the source code of many proposed algorithms has not been made publicly available. To address these issues, we have developed a MATLAB platform for evolutionary multi-objective optimization, called PlatEMO, which includes more than 50 multi-objective evolutionary algorithms and more than 100 multi-objective test problems, along with several widely used performance indicators. With a user-friendly graphical user interface, PlatEMO enables users to easily compare several evolutionary algorithms at one time and collect statistical results in Excel or LaTeX files. More importantly, PlatEMO is completely open source, so that users are able to develop new algorithms on top of it. This paper introduces the main features of PlatEMO and illustrates how to use it for performing comparative experiments, embedding new algorithms, creating new test problems, and developing performance indicators. Source code of PlatEMO is available at: http://bimk.ahu.edu.cn/index.php?s=/Index/Software/index.html.