The use of kernel functions is a common technique to extract important features from data sets. A quantum computer can be used to estimate kernel entries as transition amplitudes of unitary circuits. It can be shown quantum kernels exist that, subject to computational hardness assumptions, can not be computed classically. It is an important challenge to find quantum kernels that provide an advantage in the classification of real world data. Here we introduce a class of quantum kernels that are related to covariant quantum measurements and can be used for data that has a group structure. The kernel is defined in terms of a single fiducial state that can be optimized by using a technique called kernel alignment. Quantum kernel alignment optimizes the kernel family to minimize the upper bound on the generalisation error for a given data set. We apply this general method to a specific learning problem we refer to as labeling cosets with error and implement the learning algorithm on $27$ qubits of a superconducting processor.