Designing time-frequency distributions (TFDs) that are both high-resolution and free of cross-terms (CTs) remains an open problem. Classical kernel-based methods are limited by a trade-off between TFD resolution and CT suppression, even with optimally derived parameters. To overcome this limitation, we propose a data-driven kernel learning model that operates directly on the Wigner-Ville distribution (WVD). The proposed kernel-learning-based TFD (KL-TFD) model consists of several stacked multi-channel learnable convolutional kernels. Specifically, a skipping operator is used to preserve correct information flow, and a weighted block is employed to exploit spatial and channel dependencies. Together, these two designs achieve high TFD resolution and CT elimination simultaneously. Numerical experiments on both synthetic and real-world data confirm the superiority of the proposed KL-TFD over traditional kernel function methods.
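To make the cross-term problem concrete, the following is a minimal sketch of a discrete WVD evaluated at one time instant for a two-tone signal. It is not part of the KL-TFD model: the function names `wvd_slice` and `two_tone`, the tone frequencies, the signal length, and the number of frequency bins are all assumptions chosen for illustration. It only shows that the raw WVD of two components contains a cross-term midway between them, which is the artifact that kernel methods try to suppress.

```python
import cmath
import math


def wvd_slice(x, n, num_bins):
    """Discrete WVD of the complex sequence x at time index n (illustrative helper).

    Bin k corresponds to normalized frequency k / (2 * num_bins) cycles/sample,
    because the lag product x[n+m] * conj(x[n-m]) spans a total lag of 2m.
    """
    max_lag = min(n, len(x) - 1 - n)  # largest symmetric lag that stays inside x
    row = []
    for k in range(num_bins):
        acc = 0j
        for m in range(-max_lag, max_lag + 1):
            lag_product = x[n + m] * x[n - m].conjugate()
            acc += lag_product * cmath.exp(-2j * math.pi * k * m / num_bins)
        row.append(acc.real)  # the lag product is Hermitian in m, so the sum is real
    return row


def two_tone(length, f1, f2):
    """Sum of two complex exponentials (assumed test signal, already analytic)."""
    return [
        cmath.exp(2j * math.pi * f1 * t) + cmath.exp(2j * math.pi * f2 * t)
        for t in range(length)
    ]


signal = two_tone(64, 0.1, 0.3)
row = wvd_slice(signal, n=30, num_bins=40)

# With 40 bins, frequency f maps to bin 2 * f * 40.
auto_1 = row[8]    # auto-term of the 0.1 tone
cross = row[16]    # cross-term at the midpoint frequency 0.2
auto_2 = row[24]   # auto-term of the 0.3 tone

# The cross-term oscillates in time, so at another instant it changes sign.
cross_later = wvd_slice(signal, n=32, num_bins=40)[16]
```

At the chosen time index the cross-term is larger than either auto-term, and two samples later it is negative. A fixed smoothing kernel can average this oscillation away, but only by also blurring the auto-terms, which is the resolution versus CT suppression trade-off described above.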