The twin-width of a graph $G$ is the minimum integer $d$ such that $G$ has a $d$-contraction sequence, that is, a sequence of $|V(G)|-1$ iterated vertex identifications for which the overall maximum number of red edges incident to a single vertex is at most $d$, where a red edge appears between two sets of identified vertices if they are not homogeneous in $G$. We show that if a graph admits a $d$-contraction sequence, then it also has a linear-arity tree of $f(d)$-contractions, for some function $f$. First this permits to show that every bounded twin-width class is small, i.e., has at most $n!c^n$ graphs labeled by $[n]$, for some constant $c$. This unifies and extends the same result for bounded treewidth graphs [Beineke and Pippert, JCT 69], proper subclasses of permutations graphs [Marcus and Tardos, JCTA 04], and proper minor-free classes [Norine et al., JCTB 06]. The second consequence is an $O(log n)$-adjacency labeling scheme for bounded twin-width graphs, confirming several cases of the implicit graph conjecture. We then explore the small conjecture that, conversely, every small hereditary class has bounded twin-width. Inspired by sorting networks of logarithmic depth, we show that $log_{Theta(log log d)}n$-subdivisions of $K_n$ (a small class when $d$ is constant) have twin-width at most $d$. We obtain a rather sharp converse with a surprisingly direct proof: the $log_{d+1}n$-subdivision of $K_n$ has twin-width at least $d$. Secondly graphs with bounded stack or queue number (also small classes) have bounded twin-width. Thirdly we show that cubic expanders obtained by iterated random 2-lifts from $K_4$~[Bilu and Linial, Combinatorica 06] have bounded twin-width, too. We suggest a promising connection between the small conjecture and group theory. Finally we define a robust notion of sparse twin-width and discuss how it compares with other sparse classes.