The information content of symbolic sequences (such as nucleic- or amino acid sequences, but also neuronal firings or strings of letters) can be calculated from an ensemble of such sequences, but because information cannot be assigned to single sequences, we cannot correlate information to other observables attached to the sequence. Here we show that an information score obtained from multivariate (multiple-variable) correlations within sequences of a training ensemble can be used to predict observables of out-of-sample sequences with an accuracy that scales with the complexity of correlations, showing that functional information emerges from a hierarchy of multi-variable correlations.