We explore the effectiveness of deep learning convolutional neural networks (CNNs) for estimating strong gravitational lens mass model parameters. We have investigated a number of practicalities faced when modelling real image data, such as how network performance depends on the inclusion of lens galaxy light, the addition of colour information, and varying signal-to-noise ratio. Our CNN was trained and tested with strong galaxy-galaxy lens images simulated to match the imaging characteristics of the Large Synoptic Survey Telescope (LSST) and Euclid. For images including lens galaxy light, the CNN can recover the lens model parameters with acceptable accuracy, although a 34 per cent average improvement in accuracy is obtained when lens light is removed. However, the inclusion of colour information can largely compensate for the drop in accuracy resulting from the presence of lens light. While our findings show similar accuracies for single-epoch Euclid VIS and LSST r-band datasets, we find a 24 per cent increase in accuracy by adding g- and i-band images to the LSST r-band without lens light, and a 20 per cent increase with lens light. The best network performance is obtained when the network is trained and tested on images where the lens light exactly follows the mass; however, when the orientation and ellipticity of the light are allowed to differ from those of the mass, the network performs most consistently when trained with a moderate amount of scatter in the difference between the mass and light profiles.
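To make the regression task concrete, the sketch below illustrates the forward pass of a CNN that maps a lens image to a small vector of mass-model parameters. The abstract does not specify the architecture, so everything here is an illustrative assumption: the single convolutional layer, the global average pooling, and the choice of five output parameters (e.g. Einstein radius, two ellipticity components, and lens centre) are placeholders, not the paper's actual network.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(image, kernels):
    """Valid-mode 2-D convolution of an (H, W) image with (K, kh, kw) kernels."""
    K, kh, kw = kernels.shape
    H, W = image.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(image[i:i + kh, j:j + kw] * kernels[k])
    return out

def forward(image, kernels, weights, bias):
    """One conv layer + ReLU, global average pooling, then a linear regression head."""
    feats = np.maximum(conv2d(image, kernels), 0.0)  # conv + ReLU
    pooled = feats.mean(axis=(1, 2))                 # global average pool per channel
    return weights @ pooled + bias                   # predicted lens-model parameters

# Toy 32x32 "lens image" and randomly initialised network parameters.
image = rng.normal(size=(32, 32))
kernels = rng.normal(size=(8, 3, 3)) * 0.1
weights = rng.normal(size=(5, 8)) * 0.1  # 5 outputs: hypothetical theta_E, e1, e2, x, y
bias = np.zeros(5)

params = forward(image, kernels, weights, bias)
print(params.shape)  # (5,)
```

In practice a network of this kind would be trained by minimising a regression loss (e.g. mean squared error) between the predicted and true simulation parameters over many simulated lenses; a multi-band input, as discussed above, would simply add image channels to the first convolutional layer.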