An introduction to computational algebraic statistics


Abstract in English

In this paper, we introduce the fundamental notion of a Markov basis, which is one of the first connections between commutative algebra and statistics. The notion of a Markov basis is first introduced by Diaconis and Sturmfels (1998) for conditional testing problems on contingency tables by Markov chain Monte Carlo methods. In this method, we make use of a connected Markov chain over the given conditional sample space to estimate the P-values numerically for various conditional tests. A Markov basis plays an importance role in this arguments, because it guarantees the connectivity of the chain, which is needed for unbiasedness of the estimate, for arbitrary conditional sample space. As another important point, a Markov basis is characterized as generators of the well-specified toric ideals of polynomial rings. This connection between commutative algebra and statistics is the main result of Diaconis and Sturmfels (1998). After this first paper, a Markov basis is studied intensively by many researchers both in commutative algebra and statistics, which yields an attractive field called computational algebraic statistics. In this paper, we give a review of the Markov chain Monte Carlo methods for contingency tables and Markov bases, with some fundamental examples. We also give some computational examples by algebraic software Macaulay2 and statistical software R. Readers can also find theoretical details of the problems considered in this paper and various results on the structure and examples of Markov bases in Aoki, Hara and Takemura (2012).

Download