Many representation systems on the sphere have been proposed, such as spherical harmonics, wavelets, and curvelets. Each of these representations is designed to extract a specific set of features, so choosing the best fixed representation system for a given scientific application is challenging. In this paper, we show that a representation system can be learned directly from given data on the sphere. We propose two new adaptive approaches: the first is a (potentially multi-scale) patch-based dictionary learning approach, and the second selects a representation from a parametrized family of representations, the α-shearlets. We investigate their relative performance in representing and denoising complex structures on different astrophysical data sets on the sphere.