Recent advances in quantum computing have drawn considerable attention to building realistic applications for quantum computers. However, designing a suitable quantum circuit architecture requires expert knowledge. For example, it is non-trivial to design a quantum gate sequence that generates a particular quantum state with as few gates as possible. We propose a quantum architecture search framework based on deep reinforcement learning (DRL) to address this challenge. In the proposed framework, the DRL agent has access only to the Pauli-$X$, $Y$, $Z$ expectation values and a predefined set of quantum operations for learning the target quantum state, and it is optimized by the advantage actor-critic (A2C) and proximal policy optimization (PPO) algorithms. We demonstrate the successful generation of quantum gate sequences for multi-qubit GHZ states without encoding any knowledge of quantum physics in the agent. The design of our framework is general and can be employed with other DRL architectures or optimization methods to study gate synthesis and compilation for many quantum states.
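
To make the agent's interface concrete, the following is a minimal sketch of the observation and action structure described above, written for a two-qubit GHZ (Bell) target. It is an illustration under stated assumptions, not the implementation used in this work: the gate set in `actions`, the helper names `on_qubit`, `cnot`, `observe`, and `fidelity`, and the use of state fidelity as the success measure are all hypothetical choices, and no learning algorithm is included.

```python
import numpy as np

# Single-qubit matrices used to build observations and gates.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P0 = np.array([[1, 0], [0, 0]], dtype=complex)  # projector onto |0>
P1 = np.array([[0, 0], [0, 1]], dtype=complex)  # projector onto |1>


def on_qubit(op, qubit, n):
    """Embed a single-qubit operator on `qubit` of an n-qubit register (qubit 0 is leftmost)."""
    out = np.array([[1]], dtype=complex)
    for k in range(n):
        out = np.kron(out, op if k == qubit else I2)
    return out


def cnot(control, target, n):
    """CNOT built as P0 (x) I + P1 (x) X on the chosen control and target qubits."""
    return on_qubit(P0, control, n) + on_qubit(P1, control, n) @ on_qubit(X, target, n)


def observe(state, n):
    """Agent observation: Pauli-X, Y, Z expectation values on every qubit."""
    return np.array([
        np.real(np.vdot(state, on_qubit(P, q, n) @ state))
        for q in range(n)
        for P in (X, Y, Z)
    ])


def fidelity(state, target):
    """Overlap |<target|state>|^2 (assumed success measure, not taken from the text)."""
    return abs(np.vdot(target, state)) ** 2


n = 2
# Assumed (hypothetical) action set: each action appends one gate to the circuit.
actions = {
    "H0": on_qubit(H, 0, n),
    "H1": on_qubit(H, 1, n),
    "X0": on_qubit(X, 0, n),
    "X1": on_qubit(X, 1, n),
    "CNOT01": cnot(0, 1, n),
    "CNOT10": cnot(1, 0, n),
}

# Start in |00> and define the two-qubit GHZ target (|00> + |11>)/sqrt(2).
initial = np.zeros(2 ** n, dtype=complex)
initial[0] = 1.0
target = np.zeros(2 ** n, dtype=complex)
target[0] = target[-1] = 1 / np.sqrt(2)

# A gate sequence an agent could discover: Hadamard on qubit 0, then CNOT(0 -> 1).
state = initial.copy()
for name in ["H0", "CNOT01"]:
    state = actions[name] @ state

obs = observe(state, n)
fid = fidelity(state, target)
print("observation:", np.round(obs, 6))
print("fidelity:", round(float(fid), 6))
```

In a full training loop, a DRL agent would receive `obs` after each step, choose a key of `actions`, and be rewarded according to a measure such as `fid`; how the reward and termination conditions are actually defined is not specified in this abstract.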