In an increasingly interconnected world, understanding and summarizing the structure of networks becomes ever more relevant. However, this task is nontrivial; proposed summary statistics are as diverse as the networks they describe, and a standardized hierarchy has not yet been established. In contrast, vector-valued random variables admit such a hierarchy in terms of their cumulants (e.g., mean, (co)variance, skewness, kurtosis). Here, we introduce the natural analogue of cumulants for networks, building a hierarchical description based on correlations among an increasing number of connections. This description seamlessly incorporates additional information, such as directed edges, node attributes, and edge weights. These graph cumulants provide a principled and unifying framework for quantifying the propensity of a network to display any substructure of interest (such as cliques to measure clustering). Moreover, they give rise to a natural hierarchical family of maximum entropy models for networks (i.e., exponential random graph models, or ERGMs) that do not suffer from the degeneracy problem, a common practical pitfall of other ERGMs.