Quantum compiling aims to construct a quantum circuit V, built from gates drawn from a native gate alphabet, that is functionally equivalent to a target unitary U. It is a crucial stage in running quantum algorithms on noisy intermediate-scale quantum (NISQ) devices. However, the space of possible circuit structures is enormous, so finding a good structure typically requires human expertise, hundreds of trials, or modification of existing quantum circuits. In this paper, we propose a variational quantum compiling (VQC) algorithm based on reinforcement learning (RL) that automatically designs the circuit structure for VQC without human intervention. An agent is trained to sequentially select quantum gates from the native gate alphabet, together with the qubits they act on, using double Q-learning with an epsilon-greedy exploration strategy and experience replay. The agent first randomly explores a number of quantum circuits with different structures and then iteratively discovers structures that perform better on the learning task. Simulation results show that the proposed method can achieve exact compilations with fewer quantum gates than previous VQC algorithms. It can thereby reduce the errors that decoherence and gate noise introduce into quantum algorithms on NISQ devices, and enable quantum algorithms, especially complex ones, to be executed within the coherence time.
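
As a rough illustration of the training loop described above, the following minimal sketch shows tabular double Q-learning with epsilon-greedy action selection and an experience replay buffer. It is an assumption-laden toy, not the implementation used in this work: the action set `ACTIONS`, the representation of a state as the tuple of actions chosen so far, the function names, and the hyperparameters are all hypothetical, and the reward is left as a value supplied by the caller (for example, some measure of closeness between V and U). A practical agent may replace the tables with a function approximator.

```python
import random
from collections import defaultdict, deque

# Hypothetical native gate alphabet: (gate name, qubits acted on).
# These entries are illustrative only and are not taken from the paper.
ACTIONS = [("rz", (0,)), ("rz", (1,)), ("sx", (0,)), ("sx", (1,)), ("cx", (0, 1))]


def make_q_table():
    """Return a table mapping (state, action_index) -> estimated value."""
    return defaultdict(float)


def epsilon_greedy(q_a, q_b, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one.

    The greedy action maximizes the sum of the two value tables.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q_a[(state, a)] + q_b[(state, a)])


def double_q_update(q_a, q_b, transition, alpha, gamma, rng):
    """Apply one double Q-learning update to a randomly chosen table.

    The updated table selects the best next action; the other table
    evaluates it, which is what decouples selection from evaluation.
    """
    state, action, reward, next_state, done = transition
    if rng.random() < 0.5:
        q_upd, q_eval = q_a, q_b
    else:
        q_upd, q_eval = q_b, q_a
    if done:
        target = reward
    else:
        best = max(range(len(ACTIONS)), key=lambda a: q_upd[(next_state, a)])
        target = reward + gamma * q_eval[(next_state, best)]
    q_upd[(state, action)] += alpha * (target - q_upd[(state, action)])


def replay(q_a, q_b, buffer, batch_size, alpha, gamma, rng):
    """Sample stored transitions uniformly and update on each of them."""
    batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
    for transition in batch:
        double_q_update(q_a, q_b, transition, alpha, gamma, rng)
    return len(batch)


if __name__ == "__main__":
    rng = random.Random(0)
    q_a, q_b = make_q_table(), make_q_table()
    buffer = deque(maxlen=1000)  # experience replay memory

    state = ()  # empty circuit
    action = epsilon_greedy(q_a, q_b, state, epsilon=1.0, rng=rng)
    next_state = state + (action,)  # circuit with one more gate appended
    # Placeholder reward; a real reward would come from evaluating the circuit.
    buffer.append((state, action, 0.0, next_state, False))
    replay(q_a, q_b, buffer, batch_size=32, alpha=0.1, gamma=0.9, rng=rng)
    print("chose gate:", ACTIONS[action])
```

In this sketch, starting with `epsilon` near 1 and lowering it over episodes would reproduce the behavior described above, in which the agent first explores circuit structures at random and then increasingly exploits the structures it has found to perform well.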