Name disambiguation is a key and also a very tough problem in many online systems such as social search and academic search. Despite considerable research, a critical issue that has not been systematically studied is disambiguation on the fly -- to complete the disambiguation in the real-time. This is very challenging, as the disambiguation algorithm must be accurate, efficient, and error tolerance. In this paper, we propose a novel framework -- CONNA -- to train a matching component and a decision component jointly via reinforcement learning. The matching component is responsible for finding the top matched candidate for the given paper, and the decision component is responsible for deciding on assigning the top matched person or creating a new person. The two components are intertwined and can be bootstrapped via jointly training. Empirically, we evaluate CONNA on two name disambiguation datasets. Experimental results show that the proposed framework can achieve a 1.21%-19.84% improvement on F1-score using joint training of the matching and the decision components. The proposed CONNA has been successfully deployed on AMiner -- a large online academic search system.