Focal plane wavefront sensing (FPWFS) is appealing for several reasons. Notably, it offers high sensitivity and does not suffer from non-common path aberrations (NCPA). The price to pay is a high computational burden and the need for diversity to lift any phase ambiguity. If those limitations can be overcome, FPWFS is a great solution for NCPA measurement, a key limitation for high-contrast imaging, and could be used as adaptive optics wavefront sensor. Here, we propose to use deep convolutional neural networks (CNNs) to measure NCPA based on focal plane images. Two CNN architectures are considered: ResNet-50 and U-Net which are used respectively to estimate Zernike coefficients or directly the phase. The models are trained on labelled datasets and evaluated at various flux levels and for two spatial frequency contents (20 and 100 Zernike modes). In these idealized simulations we demonstrate that the CNN-based models reach the photon noise limit in a large range of conditions. We show, for example, that the root mean squared (rms) wavefront error (WFE) can be reduced to < $lambda$/1500 for $2 times 10^6$ photons in one iteration when estimating 20 Zernike modes. We also show that CNN-based models are sufficiently robust to varying signal-to-noise ratio, under the presence of higher-order aberrations, and under different amplitudes of aberrations. Additionally, they display similar to superior performance compared to iterative phase retrieval algorithms. CNNs therefore represent a compelling way to implement FPWFS, which can leverage the high sensitivity of FPWFS over a broad range of conditions.