Causal inference of treatment effects is a challenging undertaking in it of itself; inference for sequential treatments leads to even more hurdles. In precision medicine, one additional ambitious goal may be to infer about effects of dynamic treatment regimes (DTRs) and to identify optimal DTRs. Conventional methods for inferring about DTRs involve powerful semi-parametric estimators. However, these are not without their strong assumptions. Dynamic Marginal Structural Models (MSMs) are one semi-parametric approach used to infer about optimal DTRs in a family of regimes. To achieve this, investigators are forced to model the expected outcome under adherence to a DTR in the family; relatively straightforward models may lead to bias in the optimum. One way to obviate this difficulty is to perform a grid search for the optimal DTR. Unfortunately, this approach becomes prohibitive as the complexity of regimes considered increases. In recently developed Bayesian methods for dynamic MSMs, computational challenges may be compounded by the fact that at each grid point, a posterior mean must be calculated. We propose a manner by which to alleviate modelling difficulties for DTRs by using Gaussian process optimization. More precisely, we show how to pair this optimization approach with robust estimators for the causal effect of adherence to a DTR to identify optimal DTRs. We examine how to find the optimum in complex, multi-modal settings which are not generally addressed in the DTR literature. We further evaluate the sensitivity of the approach to a variety of modeling assumptions in the Gaussian process.