Numerical approximation methods for the Koopman operator have advanced considerably in the last few years. In particular, data-driven approaches such as dynamic mode decomposition (DMD) and its generalization, the extended-DMD (EDMD), are becoming increasingly popular in practical applications. The EDMD improves upon the classical DMD by the inclusion of a flexible choice of dictionary of observables that spans a finite dimensional subspace on which the Koopman operator can be approximated. This enhances the accuracy of the solution reconstruction and broadens the applicability of the Koopman formalism. Although the convergence of the EDMD has been established, applying the method in practice requires a careful choice of the observables to improve convergence with just a finite number of terms. This is especially difficult for high dimensional and highly nonlinear systems. In this paper, we employ ideas from machine learning to improve upon the EDMD method. We develop an iterative approximation algorithm which couples the EDMD with a trainable dictionary represented by an artificial neural network. Using the Duffing oscillator and the Kuramoto Sivashinsky PDE as examples, we show that our algorithm can effectively and efficiently adapt the trainable dictionary to the problem at hand to achieve good reconstruction accuracy without the need to choose a fixed dictionary a priori. Furthermore, to obtain a given accuracy we require fewer dictionary terms than EDMD with fixed dictionaries. This alleviates an important shortcoming of the EDMD algorithm and enhances the applicability of the Koopman framework to practical problems.