Fuzzing is increasingly popular in the field of vulnerability detection. During fuzzing, the seed selection strategy plays an important role in guiding the direction of evolution. However, state-of-the-art fuzzers focus only on individual sources of uncertainty, neglecting the multi-factor uncertainty introduced by both randomization and evolution. In this paper, we treat seed selection in fuzzing as a large-scale online planning problem under uncertainty and propose Alpha-Fuzz, a new intelligent seed selection strategy. Alpha-Fuzz leverages the Monte-Carlo tree search (MCTS) algorithm to cope with the uncertainty caused by both the randomization and the evolution of fuzzing. In particular, we analyze the role of the evolutionary relationship between seeds during fuzzing, and propose a new tree policy and a new default policy that adapt the MCTS algorithm to fuzzing. We compared Alpha-Fuzz with four state-of-the-art fuzzers on 12 real-world applications and the LAVA-M data set. The experimental results show that Alpha-Fuzz finds more bugs on LAVA-M and outperforms the other tools in terms of code coverage and number of bugs discovered in the real-world applications. In addition, we tested the compatibility of Alpha-Fuzz, and the results show that it can improve the performance of existing tools such as MOPT and QSYM.
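To make the idea concrete, the following is a minimal, hypothetical sketch of MCTS-style seed selection, assuming the seed pool is organized as a derivation tree in which a child seed was produced by mutating its parent. The names (Seed, select_seed, C) and the reward definition are illustrative assumptions, not Alpha-Fuzz's actual interface or its specific tree and default policies.

```python
import math

C = 1.4  # exploration constant in the UCB1 term (assumed value)

class Seed:
    """A seed input, linked to the seeds derived from it by mutation."""
    def __init__(self, data, parent=None):
        self.data = data          # the test input this seed represents
        self.parent = parent
        self.children = []        # seeds derived from this one by mutation
        self.visits = 0           # times this seed was chosen for fuzzing
        self.reward = 0.0         # accumulated score, e.g. new coverage found

    def ucb1(self, total_visits):
        if self.visits == 0:
            return float("inf")   # always try unvisited seeds first
        exploit = self.reward / self.visits
        explore = C * math.sqrt(math.log(total_visits) / self.visits)
        return exploit + explore

def select_seed(root):
    """Tree policy: descend from the root, picking the child with the
    highest UCB1 score, until reaching a leaf seed to fuzz next."""
    node = root
    while node.children:
        total = sum(c.visits for c in node.children) + 1
        node = max(node.children, key=lambda c: c.ucb1(total))
    return node

def backpropagate(seed, reward):
    """Propagate the fuzzing outcome (e.g. 1.0 if new coverage was hit)
    up the derivation chain so ancestors share credit for good offspring."""
    while seed is not None:
        seed.visits += 1
        seed.reward += reward
        seed = seed.parent
```

In this sketch, the evolutionary relationship between seeds is what gives backpropagation meaning: a seed whose mutated offspring keep finding new coverage accumulates reward, so the tree policy keeps returning to that region of the input space while the UCB1 exploration term still forces occasional visits to neglected seeds.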