A freely walking fly visits roughly 100 stereotyped states in a strongly non-Markovian sequence. To explore these dynamics, we develop a generalization of the information bottleneck method, compressing the large number of behavioral states into a more compact description that maximally preserves the correlations between successive states. Surprisingly, preserving these short-time correlations with a compression into just two states captures the long-ranged correlations seen in the raw data. Having reduced the behavior to a binary sequence, we describe the distribution of these sequences by an Ising model with pairwise interactions, which is the maximum entropy model that matches the two-point correlations. Matching the correlation function at longer and longer times drives the resulting model toward the Ising model with inverse-square interactions and near-zero magnetic field. The emergence of this statistical physics problem from the analysis of real data on animal behavior is unexpected.
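
For concreteness, the pairwise maximum entropy model described above can be sketched in the standard Ising form. The notation here is an assumption for illustration and is not taken from the text: $\sigma_t = \pm 1$ labels the two compressed states at time $t$, $h$ is the magnetic field, $J(\tau)$ is the coupling between states separated by a lag $\tau$ (taken to depend only on the lag, i.e. assuming stationarity), and $Z$ is the normalizing partition function.

\[
P(\sigma_1,\dots,\sigma_T) \;=\; \frac{1}{Z}\,\exp\!\Big[\, h\sum_{t}\sigma_t \;+\; \frac{1}{2}\sum_{t\neq t'} J(|t-t'|)\,\sigma_t\,\sigma_{t'} \Big],
\qquad J(\tau)\propto \frac{1}{\tau^{2}}, \quad h\approx 0 .
\]

The first expression is the generic maximum entropy distribution consistent with the mean and the two-point correlations; the conditions on $J(\tau)$ and $h$ express the inverse-square, near-zero-field limit that the model approaches as correlations at longer times are matched.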