We study a budgeted hyper-parameter tuning problem, in which we optimize the tuning result under a hard resource constraint. We propose to solve it as a sequential decision-making problem, so that the partial training progress of configurations can be used to dynamically allocate the remaining budget. Our algorithm combines a Bayesian belief model, which estimates the future performance of configurations, with an action-value function, which balances the exploration-exploitation tradeoff, to optimize the final output. It automatically adapts its tuning behavior to different constraints, which is useful in practice. Experimental results demonstrate superior performance over existing algorithms, including the state-of-the-art one, on real-world tuning tasks across a range of budgets.
As neural networks are increasingly employed in machine learning practice, how to efficiently share limited training resources among a diverse set of model training tasks becomes a crucial issue. To achieve better utilization of the shared resources, we explore the idea of jointly training multiple neural network models on a single GPU in this paper. We realize this idea by proposing a primitive, called pack. We further present a comprehensive empirical study of pack and end-to-end experiments that suggest significant improvements for hyperparameter tuning. The results suggest: (1) packing two models can bring up to 40% performance improvement over unpacked setups for a single training step, and the improvement increases when packing more models; (2) the benefit of the pack primitive largely depends on a number of factors including memory capacity, chip architecture, neural network structure, and batch size; (3) there exists a trade-off between packing and unpacking when training multiple neural network models on limited resources; (4) a pack-aware Hyperband is up to 2.7x faster than the original Hyperband, with this improvement growing as memory size, and with it the density of packed models, increases.
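As a rough illustration of why Hyperband offers room for packing, the sketch below computes the standard Hyperband bracket schedule (number of configurations and resource per configuration at each rung). The function names `hyperband_schedule` and `packed_jobs`, and the idea of counting jobs as groups of `pack_size` configurations, are assumptions made for this example; the abstract does not describe how the pack-aware variant actually schedules models.

```python
def hyperband_schedule(max_resource, eta=3):
    """Return the standard Hyperband brackets.

    Each bracket is a list of rungs (n_configs, resource_per_config).
    """
    # Largest s with eta**s <= max_resource, found without floating-point logs.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        # n = ceil((s_max + 1) * eta**s / (s + 1)), using integer arithmetic.
        numerator = (s_max + 1) * eta ** s
        n = -(-numerator // (s + 1))
        r = max_resource / eta ** s
        # Successive halving: keep n // eta**i configs, each given r * eta**i.
        rungs = [(n // eta ** i, r * eta ** i) for i in range(s + 1)]
        brackets.append(rungs)
    return brackets


def packed_jobs(n_configs, pack_size):
    """Hypothetical helper: training jobs needed if pack_size configs share one GPU job."""
    return -(-n_configs // pack_size)


if __name__ == "__main__":
    for bracket in hyperband_schedule(81, eta=3):
        print(bracket)
    print(packed_jobs(81, 4))
```

The early rungs of the first brackets contain many configurations that each train briefly, which is where grouping several models into one job would reduce the job count the most under this simplified view.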
Machine learning is a powerful method for modeling in different fields such as education. Its capability to accurately predict students' success makes it an ideal tool for decision-making tasks related to higher education. The accuracy of machine learning models depends on selecting the proper hyper-parameters. However, this is not an easy task, because tuning the hyper-parameters to fit the machine learning model requires time and expertise. In this paper, we examine the effectiveness of automated hyper-parameter tuning techniques in the realm of students' success. To that end, we develop two automated Hyper-Parameter Optimization (HPO) methods, namely grid search and random search, to assess and improve a previous study's performance. The experimental results show that applying random search and grid search to machine learning algorithms improves accuracy. We empirically show the superiority of automated methods on real-world educational data (MIDFIELD) for tuning the hyper-parameters of conventional machine learning classifiers. This work emphasizes the effectiveness of automated hyper-parameter optimization when applying machine learning in the education field to aid the decisions of faculties, directors, or non-expert users to improve students' success.
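The two search strategies named above can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the objective `toy_accuracy`, the hyper-parameter names `max_depth` and `learning_rate`, and their ranges are hypothetical stand-ins for a real cross-validated classifier score.

```python
import itertools
import random


def grid_search(objective, grid):
    """Evaluate every combination in `grid`; return (best_score, best_config)."""
    names = sorted(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = objective(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best


def random_search(objective, space, n_trials, seed=0):
    """Sample `n_trials` configs uniformly from `space`; return (best_score, best_config)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best


def toy_accuracy(config):
    # Hypothetical stand-in for validation accuracy; peaks at depth 6, rate 0.1.
    return (1.0
            - 0.01 * (config["max_depth"] - 6) ** 2
            - 4.0 * (config["learning_rate"] - 0.1) ** 2)


if __name__ == "__main__":
    grid = {"max_depth": [2, 4, 6, 8], "learning_rate": [0.01, 0.1, 0.3]}
    print(grid_search(toy_accuracy, grid))
    space = {"max_depth": (2.0, 8.0), "learning_rate": (0.01, 0.3)}
    print(random_search(toy_accuracy, space, n_trials=50))
```

Grid search costs the product of the grid sizes (12 evaluations here), while random search uses a fixed trial budget regardless of how many hyper-parameters are tuned.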
This paper proposes the first-ever algorithmic framework for tuning the hyper-parameters of stochastic optimization algorithms based on reinforcement learning. Hyper-parameters have a significant influence on the performance of stochastic optimization algorithms, such as evolutionary algorithms (EAs) and meta-heuristics. Yet determining optimal hyper-parameters is very time-consuming due to the stochastic nature of these algorithms. We propose to model the tuning procedure as a Markov decision process, and resort to the policy gradient algorithm to tune the hyper-parameters. Experiments on tuning stochastic algorithms with different kinds of hyper-parameters (continuous and discrete) for different optimization problems (continuous and discrete) show that the proposed hyper-parameter tuning algorithms require far fewer runs of the stochastic algorithms than the Bayesian optimization method. The proposed framework can be used as a standard tool for hyper-parameter tuning in stochastic algorithms.
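To make the policy-gradient idea concrete, here is a deliberately simplified sketch: a REINFORCE update on a softmax policy that picks one discrete hyper-parameter value per episode. It collapses the Markov decision process to a single step, which is an assumption of this example and not the paper's formulation; `noisy_run`, the candidate mutation rates, and the reward shape are likewise hypothetical.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_tune(run_algorithm, candidates, n_episodes=2000, lr=0.1, seed=0):
    """Learn a softmax policy over `candidates`; return the final probabilities."""
    rng = random.Random(seed)
    prefs = [0.0] * len(candidates)
    baseline = 0.0  # running mean of past rewards, used to reduce variance
    for episode in range(1, n_episodes + 1):
        probs = softmax(prefs)
        action = rng.choices(range(len(candidates)), weights=probs)[0]
        reward = run_algorithm(candidates[action], rng)
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. prefs[k] is 1[k == action] - probs[k].
        for k in range(len(prefs)):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            prefs[k] += lr * advantage * grad_log_pi
        baseline += (reward - baseline) / episode
    return softmax(prefs)


def noisy_run(mutation_rate, rng):
    # Hypothetical stochastic optimizer: mean reward peaks at mutation_rate = 0.1.
    return 1.0 - 10.0 * abs(mutation_rate - 0.1) + rng.gauss(0.0, 0.1)


if __name__ == "__main__":
    rates = [0.01, 0.1, 0.3]
    print(reinforce_tune(noisy_run, rates))
```

Each episode costs one run of the stochastic algorithm, so the number of episodes is the quantity a tuning method of this kind tries to keep small.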
In the context of deep learning, the costliest phase from a computational point of view is the full training of the learning algorithm. However, this process must be repeated a significant number of times during the design of a new artificial neural network, which therefore leads to extremely expensive operations. Here, we propose a low-cost strategy to predict the accuracy of the algorithm, based only on its initial behaviour. To do so, we train the network of interest up to convergence several times, modifying its characteristics at each training. The initial and final accuracies observed during this preliminary process are stored in a database. We then make use of both curve fitting and Support Vector Machine techniques, the latter being trained on the created database, to predict the accuracy of the network, given its accuracy on the first iterations of its learning. This approach can be of particular interest when the space of the characteristics of the network is notably large or when its full training is highly time-consuming. The results we obtained are promising and encouraged us to apply this strategy to a topical issue: hyper-parameter optimisation (HO). In particular, we focused on the HO of a convolutional neural network for the classification of the MNIST and CIFAR-10 datasets. By using our method of prediction, and an algorithm implemented by us for a probabilistic exploration of the hyper-parameter space, we were able to find the hyper-parameter settings corresponding to the optimal accuracies already known in the literature, at quite low cost.
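The curve-fitting half of this idea can be illustrated with a small sketch that extrapolates final accuracy from the first few epochs. The saturating form acc(t) = a - b / t is an assumption chosen here because it is linear in its parameters and needs only ordinary least squares; the abstract does not state which curve family the authors fit, and the Support Vector Machine component is not reproduced.

```python
def fit_saturating_curve(epochs, accuracies):
    """Least-squares fit of acc(t) = a - b / t; returns (a, b).

    `a` is the predicted accuracy as t grows large.
    """
    xs = [1.0 / t for t in epochs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracies) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracies))
    slope = sxy / sxx  # the model is acc = a + slope * (1 / t), so slope = -b
    a = mean_y - slope * mean_x
    return a, -slope


def predict_accuracy(a, b, epoch):
    """Evaluate the fitted curve at a given epoch."""
    return a - b / epoch


if __name__ == "__main__":
    # Hypothetical early-training accuracies for epochs 1..5.
    epochs = [1, 2, 3, 4, 5]
    observed = [0.9 - 0.5 / t for t in epochs]
    a, b = fit_saturating_curve(epochs, observed)
    print(a, b, predict_accuracy(a, b, 100))
```

A configuration whose predicted plateau `a` is clearly below the current best could then be discarded without training it to convergence.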
Machine learning techniques lend themselves as promising decision-making and analytic tools in a wide range of applications. Different ML algorithms have various hyper-parameters. In order to tailor an ML model towards a specific application, a large number of hyper-parameters should be tuned. Tuning the hyper-parameters directly affects the performance (accuracy and run-time). However, for large-scale search spaces, efficiently exploring the ample number of combinations of hyper-parameters is computationally challenging. Existing automated hyper-parameter tuning techniques suffer from high time complexity. In this paper, we propose HyP-ABC, an automatic innovative hybrid hyper-parameter optimization algorithm using the modified artificial bee colony approach, to measure the classification accuracy of three ML algorithms, namely random forest, extreme gradient boosting, and support vector machine. Compared to the state-of-the-art techniques, HyP-ABC is more efficient and has a limited number of parameters to be tuned, making it worthwhile for real-world hyper-parameter optimization problems. We further compare our proposed HyP-ABC algorithm with state-of-the-art techniques. In order to ensure the robustness of the proposed method, the algorithm takes a wide range of feasible hyper-parameter values, and is tested using a real-world educational dataset.
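For readers unfamiliar with the underlying metaheuristic, the following is a minimal sketch of a basic artificial bee colony search with its employed, onlooker, and scout phases. It is not HyP-ABC: the modifications and hybrid components of that algorithm are not described in the abstract, and `abc_maximize`, `toy_score`, and all default parameter values here are assumptions. It requires at least two food sources.

```python
import random


def abc_maximize(score, bounds, n_sources=10, n_cycles=200, limit=10, seed=0):
    """Basic artificial bee colony search; returns (best_score, best_config)."""
    rng = random.Random(seed)
    sources = [None] * n_sources
    scores = [float("-inf")] * n_sources
    trials = [0] * n_sources
    best = {"score": float("-inf"), "config": None}

    def random_config():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def store(i, config, value):
        # Place a config at food source i and remember it if it is the best so far.
        sources[i], scores[i], trials[i] = config, value, 0
        if value > best["score"]:
            best["score"], best["config"] = value, list(config)

    def try_neighbor(i):
        # Perturb one coordinate of source i relative to a different source k.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(len(bounds))
        lo, hi = bounds[d]
        candidate = list(sources[i])
        step = rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        candidate[d] = min(hi, max(lo, candidate[d] + step))
        value = score(candidate)
        if value > scores[i]:
            store(i, candidate, value)  # greedy selection
        else:
            trials[i] += 1

    for i in range(n_sources):
        config = random_config()
        store(i, config, score(config))

    for _ in range(n_cycles):
        for i in range(n_sources):  # employed bees: one attempt per source
            try_neighbor(i)
        low = min(scores)
        weights = [s - low + 1e-12 for s in scores]
        # Onlooker bees: revisit sources with probability increasing in score.
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_neighbor(i)
        for i in range(n_sources):  # scouts: abandon stagnant sources
            if trials[i] > limit:
                config = random_config()
                store(i, config, score(config))
    return best["score"], best["config"]


def toy_score(config):
    # Hypothetical stand-in for classifier accuracy; maximum 0 at (1, -2).
    x, y = config
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)


if __name__ == "__main__":
    print(abc_maximize(toy_score, [(-5.0, 5.0), (-5.0, 5.0)]))
```

Apart from the colony size and cycle count, the only control parameter is `limit`, which illustrates why ABC-style methods are often described as having few parameters of their own to tune.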