Long memory, or long-range dependence, is an important phenomenon that may arise in the analysis of time series or spatial data. Most definitions of long memory for a stationary process $X=\{X_1, X_2, \cdots\}$ are based on the second-order properties of the process. The excess entropy of a stationary process is the sum of redundancies, which relates to the rate of convergence of the conditional entropy $H(X_n \mid X_{n-1}, \cdots, X_1)$ to the entropy rate. It is proved that the excess entropy is identical to the mutual information between the past and the future when the entropy $H(X_1)$ is finite. We propose defining a stationary process to have long memory if its excess entropy is infinite. Since the definition of the excess entropy of a stationary process requires only a very weak moment condition on the distribution of the process, it can be applied to processes whose distributions do not have a bounded second moment. A significant property of excess entropy is that it is invariant under invertible transformations, which allows the excess entropy of one stationary process to be obtained from that of another. For stationary Gaussian processes, the excess entropy characterization of long memory agrees well with the popular characterizations. It is proved that the excess entropy of fractional Gaussian noise is infinite if the Hurst parameter $H \in (1/2, 1)$.
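The last claim can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes the standard unit-variance autocovariance of fractional Gaussian noise, $\gamma(k) = \tfrac{1}{2}(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H})$, and uses the Gaussian identity that the mutual information between two jointly Gaussian blocks equals $\tfrac{1}{2}\log(\det\Sigma_{\text{past}}\det\Sigma_{\text{future}}/\det\Sigma_{\text{joint}})$. The function names `fgn_autocov` and `past_future_mi` are hypothetical, and the Hurst parameter is called `hurst` to avoid a clash with the entropy symbol $H$. Finite blocks can only suggest divergence, not prove it.

```python
import numpy as np


def fgn_autocov(lags, hurst):
    """Autocovariance of unit-variance fractional Gaussian noise (assumed standard form)."""
    k = np.abs(np.asarray(lags, dtype=float))
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + np.abs(k - 1) ** two_h)


def past_future_mi(n, hurst):
    """Mutual information (in nats) between n past and n future samples of the process."""
    idx = np.arange(2 * n)
    gamma = fgn_autocov(idx, hurst)
    # Toeplitz covariance matrix of (X_1, ..., X_{2n}).
    cov = gamma[np.abs(idx[:, None] - idx[None, :])]
    _, logdet_joint = np.linalg.slogdet(cov)
    # By stationarity the past and future blocks share the same covariance matrix.
    _, logdet_block = np.linalg.slogdet(cov[:n, :n])
    return 0.5 * (2.0 * logdet_block - logdet_joint)


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, past_future_mi(n, 0.8), past_future_mi(n, 0.5))
```

For `hurst = 0.5` the noise is white and the block mutual information is zero, while for `hurst` in $(1/2, 1)$ the value keeps increasing with the block length `n`, consistent with an infinite excess entropy in the limit.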