Using deep neural networks to solve PDEs has attracted a lot of attentions recently. However, why the deep learning method works is falling far behind its empirical success. In this paper, we provide a rigorous numerical analysis on deep Ritz method (DRM) cite{wan11} for second order elliptic equations with Neumann boundary conditions. We establish the first nonasymptotic convergence rate in $H^1$ norm for DRM using deep networks with $mathrm{ReLU}^2$ activation functions. In addition to providing a theoretical justification of DRM, our study also shed light on how to set the hyper-parameter of depth and width to achieve the desired convergence rate in terms of number of training samples. Technically, we derive bounds on the approximation error of deep $mathrm{ReLU}^2$ network in $H^1$ norm and on the Rademacher complexity of the non-Lipschitz composition of gradient norm and $mathrm{ReLU}^2$ network, both of which are of independent interest.