This work theoretically investigates the performance of a composite neural network. A composite neural network is a rooted directed acyclic graph combining a set of pre-trained and non-instantiated neural network models, where a pre-trained neural network model is well-crafted for a specific task and targeted to approximate a specific function with instantiated weights. The advantages of adopting such a pre-trained model in a composite neural network are two folds. One is to benefit from others intelligence and diligence, and the other is saving the efforts in data preparation and resources and time in training. However, the overall performance of composite neural network is still not clear. In this work, we prove that a composite neural network, with high probability, performs better than any of its pre-trained components under certain assumptions. In addition, if an extra pre-trained component is added to a composite network, with high probability the overall performance will be improved. In the empirical evaluations, distinctively different applications support the above findings.