In this paper, we propose \textit{ReLeTA}: Reinforcement Learning based Task Allocation for temperature minimization. We design a new reward function and a new state model to facilitate the optimization of the reinforcement learning algorithm. With the new reward function and state model, ReLeTA effectively reduces the system peak temperature without compromising application performance. We implement ReLeTA on a real platform and evaluate it against state-of-the-art approaches. Experimental results show that ReLeTA reduces the average peak temperature by 4 $^{\circ}$C, with a maximum difference of up to 13 $^{\circ}$C.