For an RF-powered cognitive radio network with ambient backscattering capability, while the primary channel is busy, the RF-powered secondary user (RSU) can either backscatter the primary signal to transmit its own data or harvest energy from the primary signal (and store in its battery). The harvested energy then can be used to transmit data when the primary channel becomes idle. To maximize the throughput for the secondary system, it is critical for the RSU to decide when to backscatter and when to harvest energy. This optimal decision has to account for the dynamics of the primary channel, energy storage capability, and data to be sent. To tackle that problem, we propose a Markov decision process (MDP)-based framework to optimize RSUs decisions based on its current states, e.g., energy, data as well as the primary channel state. As the state information may not be readily available at the RSU, we then design a low-complexity online reinforcement learning algorithm that guides the RSU to find the optimal solution without requiring prior- and complete-information from the environment. The extensive simulation results then clearly show that the proposed solution achieves higher throughputs, i.e., up to 50%, than that of conventional methods.