This paper introduces a machine learning based collaborative multi-band spectrum sensing policy for cognitive radios. The proposed sensing policy guides secondary users to focus the search of unused radio spectrum to those frequencies that persistently provide them high data rate. The proposed policy is based on machine learning, which makes it adaptive with the temporally and spatially varying radio spectrum. Furthermore, there is no need for dynamic modeling of the primary activity since it is implicitly learned over time. Energy efficiency is achieved by minimizing the number of assigned sensors per each subband under a constraint on miss detection probability. It is important to control the missed detections because they cause collisions with primary transmissions and lead to retransmissions at both the primary and secondary user. Simulations show that the proposed machine learning based sensing policy improves the overall throughput of the secondary network and improves the energy efficiency while controlling the miss detection probability.