The cybersecurity of smart grids has become one of key problems in developing reliable modern power and energy systems. This paper introduces a non-stationary adversarial cost with a variation constraint for smart grids and enables us to investigate the problem of optimal smart grid protection against cyber attacks in a relatively practical scenario. In particular, a Bayesian multi-node bandit (MNB) model with adversarial costs is constructed and a new regret function is defined for this model. An algorithm called Thompson-Hedge algorithm is presented to solve the problem and the superior performance of the proposed algorithm is proven in terms of the convergence rate of the regret function. The applicability of the algorithm to real smart grid scenarios is verified and the performance of the algorithm is also demonstrated by numerical examples.