We present AstroVaDEr, a variational autoencoder designed to perform unsupervised clustering and synthetic image generation using astronomical imaging catalogues. The model is a convolutional neural network that learns to embed images into a low-dimensional latent space and simultaneously optimises a Gaussian Mixture Model (GMM) on the embedded vectors to cluster the training data. By using variational inference, we are able to use the learned GMM as a statistical prior on the latent space, which facilitates random sampling and the generation of synthetic images. We demonstrate AstroVaDEr's capabilities by training it on greyscale \textit{gri} images from the Sloan Digital Sky Survey, using a sample of galaxies that are classified by Galaxy Zoo 2. The resulting unsupervised clustering separates galaxies based on learned morphological features such as axis ratio, surface brightness profile, orientation and the presence of companions. We use the learned mixture model to generate synthetic images of galaxies based on the morphological profiles of the Gaussian components. AstroVaDEr succeeds in producing a morphological classification scheme from unlabelled data, but unexpectedly places high importance on the presence of companion objects, which demonstrates the importance of human interpretation. The network is scalable and flexible, allowing larger datasets, or different kinds of imaging data, to be classified. We also demonstrate the generative properties of the model, which allow realistic synthetic images of galaxies to be sampled from the learned classification scheme. These can be used to create synthetic image catalogues or to perform image processing tasks such as deblending.
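As a rough illustration of how a learned GMM can act as a sampling prior on the latent space, the following minimal sketch draws latent vectors from a Gaussian mixture. It is not the AstroVaDEr implementation: the function name `sample_gmm_latents`, the diagonal-covariance assumption, and all numerical values are hypothetical, and the trained decoder that would turn each latent vector into an image is not shown.

```python
import numpy as np


def sample_gmm_latents(weights, means, variances, n_samples, seed=0):
    """Draw latent vectors from a diagonal-covariance Gaussian mixture.

    Hypothetical helper for illustration only; the diagonal covariance
    is an assumption, not a detail taken from the text above.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)      # shape (K,)
    means = np.asarray(means, dtype=float)          # shape (K, D)
    variances = np.asarray(variances, dtype=float)  # shape (K, D)

    # Pick a mixture component for each sample according to the weights.
    probs = weights / weights.sum()
    components = rng.choice(len(weights), size=n_samples, p=probs)

    # Sample from the chosen component: z = mu_k + sigma_k * eps, eps ~ N(0, I).
    eps = rng.standard_normal((n_samples, means.shape[1]))
    latents = means[components] + np.sqrt(variances[components]) * eps
    return latents, components


# Toy mixture: K = 3 components in a D = 2 latent space (illustrative values).
weights = [0.5, 0.3, 0.2]
means = [[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]]
variances = [[1.0, 1.0], [0.5, 0.5], [0.25, 1.0]]

latents, components = sample_gmm_latents(weights, means, variances, n_samples=8)
```

In a full generative pipeline, each row of `latents` would be passed through the trained decoder to produce a synthetic image, and `components` would indicate which cluster the image was sampled from.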