A crucial challenge to successful flare prediction is forecasting periods that transition between flare-quiet and flare-active conditions. Building on earlier studies in this series (Barnes et al. 2016; Leka et al. 2019a,b), which describe the methodology, details, and results of flare forecasting comparison efforts, we focus here on patterns of forecast outcomes (success and failure) over multi-day periods. We develop a novel analysis to evaluate forecasting success in the context of catching the first event of flare-active periods and, conversely, of correctly predicting declining flare activity. We demonstrate these evaluation methods both graphically and quantitatively, as they provide quick comparative evaluations as well as options for detailed analysis. For the testing interval 2016-2017, we determine the relative frequency distribution of two-day dichotomous forecast outcomes for three different event histories (i.e., event/event, no-event/event, and event/no-event), and use it to highlight performance differences between forecasting methods. Across all forecasting methods, we identify a trend in which a high (low) forecast probability on day-1 remains high (low) on day-2 even though flaring activity is transitioning. For M-class and larger flares, we find that explicitly including persistence or prior flare history in computing forecasts helps to improve overall forecast performance. We also find that using magnetic/modern data leads to improvement in catching the first-event/first-no-event transitions. Finally, 15% of major (i.e., M-class or above) flare days over the testing interval were effectively missed due to a lack of observations from instruments away from the Earth-Sun line.
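As an illustration of the two-day outcome tally described above, the following is a minimal sketch and not the procedure used in this work. It assumes daily forecast probabilities are converted to yes/no forecasts with a fixed threshold (0.5 here, chosen arbitrarily), that each day is scored as a hit (H), miss (M), false alarm (F), or correct null (N), and that consecutive-day pairs are grouped by the observed event history. All function names, labels, and input values are hypothetical.

```python
from collections import Counter

# Hypothetical labels for the three two-day event histories named in the text.
# Keys are (observed on day-1, observed on day-2), with 1 = event, 0 = no event.
HISTORIES = {
    (1, 1): "event/event",
    (0, 1): "no-event/event",
    (1, 0): "event/no-event",
}


def dichotomize(probabilities, threshold=0.5):
    """Convert forecast probabilities to yes/no forecasts (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def outcome(forecast, observed):
    """Score one day: hit (H), miss (M), false alarm (F), or correct null (N)."""
    if observed:
        return "H" if forecast else "M"
    return "F" if forecast else "N"


def two_day_outcome_frequencies(forecasts, events):
    """Relative frequency of two-day outcome pairs for each event history."""
    counts = {name: Counter() for name in HISTORIES.values()}
    for day in range(len(events) - 1):
        name = HISTORIES.get((events[day], events[day + 1]))
        if name is None:
            # no-event/no-event pairs are not among the three histories
            continue
        pair = (outcome(forecasts[day], events[day])
                + outcome(forecasts[day + 1], events[day + 1]))
        counts[name][pair] += 1
    frequencies = {}
    for name, counter in counts.items():
        total = sum(counter.values())
        frequencies[name] = (
            {pair: n / total for pair, n in counter.items()} if total else {}
        )
    return frequencies


# Made-up example: six days of probabilities and observed event flags.
probabilities = [0.1, 0.2, 0.7, 0.8, 0.6, 0.1]
events = [0, 1, 1, 0, 0, 1]
result = two_day_outcome_frequencies(dichotomize(probabilities), events)
for history, freqs in result.items():
    print(history, freqs)
```

In this scheme, a pair such as "NM" under no-event/event marks a forecast that stayed low while activity began, and "HF" under event/no-event marks one that stayed high while activity declined, which is the kind of pattern the two-day analysis is meant to expose.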