We show variants of spectral sparsification routines can preserve the total spanning tree counts of graphs, which by Kirchhoffs matrix-tree theorem, is equivalent to determinant of a graph Laplacian minor, or equivalently, of any SDDM matrix. Our analyses utilizes this combinatorial connection to bridge between statistical leverage scores / effective resistances and the analysis of random graphs by [Janson, Combinatorics, Probability and Computing `94]. This leads to a routine that in quadratic time, sparsifies a graph down to about $n^{1.5}$ edges in ways that preserve both the determinant and the distribution of spanning trees (provided the sparsified graph is viewed as a random object). Extending this algorithm to work with Schur complements and approximate Choleksy factorizations leads to algorithms for counting and sampling spanning trees which are nearly optimal for dense graphs. We give an algorithm that computes a $(1 pm delta)$ approximation to the determinant of any SDDM matrix with constant probability in about $n^2 delta^{-2}$ time. This is the first routine for graphs that outperforms general-purpose routines for computing determinants of arbitrary matrices. We also give an algorithm that generates in about $n^2 delta^{-2}$ time a spanning tree of a weighted undirected graph from a distribution with total variation distance of $delta$ from the $w$-uniform distribution .