Neural architecture search (NAS) with an accuracy predictor that estimates the accuracy of candidate architectures has drawn increasing attention due to its simplicity and effectiveness. Previous works usually employ neural-network-based predictors, which require delicate design and easily overfit. Considering that most architectures are represented as sequences of discrete symbols, which resemble tabular data and are better suited to non-neural predictors, in this paper we study an alternative approach that uses a non-neural model for accuracy prediction. Specifically, since decision-tree-based models handle tabular data well, we leverage gradient boosting decision tree (GBDT) as the predictor for NAS. We demonstrate that the GBDT predictor achieves prediction accuracy comparable to, if not better than, that of neural-network-based predictors. Moreover, since a compact search space eases the search process, we propose to prune the search space gradually according to important features derived from the GBDT. In this way, NAS can be performed by first pruning the search space and then searching for a neural architecture, which is more efficient and effective. Experiments on NASBench-101 and ImageNet demonstrate the effectiveness of using GBDT as the predictor for NAS: (1) on NASBench-101, it is 22x, 8x, and 6x more sample efficient than random search, regularized evolution, and Monte Carlo tree search (MCTS) in finding the global optimum; (2) it achieves a 24.2% top-1 error rate on ImageNet, and further achieves a 23.4% top-1 error rate when enhanced with search space pruning. Code is provided at https://github.com/renqianluo/GBDT-NAS.
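To make the two-stage idea concrete, below is a minimal sketch of predictor-based NAS with a GBDT accuracy predictor. Everything in it is an illustrative assumption rather than the authors' released implementation: architectures are encoded as fixed-length 0/1 vectors, `sample_arch` and `train_and_eval` are hypothetical stand-ins for the real search space and training pipeline, and scikit-learn's GradientBoostingRegressor serves as the GBDT.

```python
# Minimal sketch (not the authors' code): GBDT as an accuracy predictor,
# with feature importances as a proxy for search-space pruning signals.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
NUM_FEATURES = 40        # length of the discrete architecture encoding (assumed)
NUM_SEED_ARCHS = 100     # architectures actually trained to fit the predictor
NUM_CANDIDATES = 10000   # candidates scored cheaply by the predictor

def sample_arch():
    """Sample a random architecture encoding (stand-in for the real space)."""
    return rng.integers(0, 2, size=NUM_FEATURES)

def train_and_eval(arch):
    """Placeholder for training an architecture and returning its accuracy."""
    return float(arch[:10].mean() + 0.1 * rng.random())  # toy signal + noise

# 1) Fit the GBDT predictor on a small set of fully evaluated architectures.
X = np.stack([sample_arch() for _ in range(NUM_SEED_ARCHS)])
y = np.array([train_and_eval(a) for a in X])
predictor = GradientBoostingRegressor(n_estimators=100).fit(X, y)

# 2) Rank encoding positions by GBDT feature importance. The paper prunes
#    the search space using importances derived from the GBDT; here we
#    only rank the positions to show where that signal comes from.
ranked = np.argsort(predictor.feature_importances_)[::-1]
print("most influential encoding positions:", ranked[:5].tolist())

# 3) Score many candidates with the predictor and keep the top few, which
#    would then be trained for real to select the final architecture.
candidates = np.stack([sample_arch() for _ in range(NUM_CANDIDATES)])
scores = predictor.predict(candidates)
top_archs = candidates[np.argsort(scores)[::-1][:10]]
```

The gradual pruning described in the abstract would iterate this loop: use the fitted predictor's importances to drop unpromising symbol choices from the space, re-sample and re-score within the smaller space, and repeat before committing to full training of the top candidates.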