User dissatisfaction caused by buffering pauses during streaming is a significant cost to the system, which we model as a non-decreasing function of the frequency of buffering pauses. Minimizing the total user dissatisfaction in a multi-channel cellular network leads to a non-convex problem. Exploiting a combinatorial structure in this problem, we first propose a polynomial-time joint admission control and channel allocation algorithm that is provably (almost) optimal. This scheme assumes that the base station (BS) knows the frame statistics of the streams. For the more practical setting in which these statistics are not available a priori at the BS, we develop a learning-based scheme with provable guarantees. This learning-based scheme is related to regret minimization in multi-armed bandits with non-i.i.d. and delayed rewards (costs). All of these algorithms require little to no feedback from the user equipment to the BS regarding the state of the media player buffer at the application layer, and hence are of practical interest.