Given a capacitated undirected graph $G=(V,E)$ with a set of terminals $K \subset V$, a mimicking network is a smaller graph $H=(V_H,E_H)$ that exactly preserves all the minimum cuts between the terminals. Specifically, the vertex set $V_H$ of the sparsifier contains the set of terminals $K$, and for every bipartition $(U, K-U)$ of the terminals $K$, the size of the minimum cut separating $U$ from $K-U$ in $G$ is exactly equal to the size of the minimum cut separating $U$ from $K-U$ in $H$. This notion of a mimicking network was introduced by Hagerup, Katajainen, Nishimura and Ragde (1995), who also exhibited a mimicking network of size $2^{2^{k}}$ for every graph with $k$ terminals. The best known lower bound on the size of a mimicking network is linear in the number of terminals. More precisely, the best known lower bound is $k+1$ for graphs with $k$ terminals (Chaudhuri et al. 2000). In this work, we improve both the upper and lower bounds, reducing the doubly-exponential gap between them to a single-exponential gap. Specifically, we obtain the following upper and lower bounds on mimicking networks: 1) Given a graph $G$, we exhibit a construction of a mimicking network whose number of vertices is at most the $(|K|-1)$-th Dedekind number ($\approx 2^{{(k-1)} \choose {\lfloor {(k-1)}/2 \rfloor}}$), independent of the size of $V$. Furthermore, we show that the construction is optimal among all {\it restricted mimicking networks}, a natural class of mimicking networks that are obtained by clustering vertices together. 2) There exist graphs with $k$ terminals that have no mimicking network of size smaller than $2^{\frac{k-1}{2}}$. We also exhibit improved constructions of mimicking networks for trees and graphs of bounded tree-width.
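
To make the definition concrete, the following is a minimal brute-force sketch in Python that checks the mimicking-network property on tiny graphs by enumerating every terminal bipartition. It is illustrative only and is not one of the constructions of this work. The function names (`cut_value`, `min_terminal_cut`, `is_mimicking_network`), the edge-list representation, and the star and triangle example are all assumptions made for this illustration, and the running time is exponential in the number of vertices.

```python
from itertools import combinations

def cut_value(edges, side):
    """Total capacity of edges with exactly one endpoint in `side`."""
    return sum(c for (u, v, c) in edges if (u in side) != (v in side))

def min_terminal_cut(vertices, edges, terminals, U):
    """Brute-force minimum cut separating U from terminals - U.

    Tries every way of placing the non-terminal vertices on U's side.
    """
    U = frozenset(U)
    free = [v for v in vertices if v not in terminals]
    best = float("inf")
    for r in range(len(free) + 1):
        for extra in combinations(free, r):
            best = min(best, cut_value(edges, U | set(extra)))
    return best

def is_mimicking_network(G, H, terminals):
    """True if H preserves the min cut of every terminal bipartition of G.

    G and H are (vertices, edges) pairs; edges are (u, v, capacity) triples.
    """
    terminals = frozenset(terminals)
    ts = sorted(terminals)
    # Every nonempty proper subset U of the terminals gives a bipartition.
    for r in range(1, len(ts)):
        for U in combinations(ts, r):
            if min_terminal_cut(*G, terminals, U) != min_terminal_cut(*H, terminals, U):
                return False
    return True

# Hypothetical example: a star with unit capacities and three terminal leaves.
K = ["t1", "t2", "t3"]
star = (K + ["c"], [("t1", "c", 1), ("t2", "c", 1), ("t3", "c", 1)])
# A triangle on the terminals with capacity 1/2 per edge preserves every cut.
triangle_half = (K, [("t1", "t2", 0.5), ("t2", "t3", 0.5), ("t1", "t3", 0.5)])
# The same triangle with unit capacities doubles every cut.
triangle_one = (K, [("t1", "t2", 1), ("t2", "t3", 1), ("t1", "t3", 1)])

print(is_mimicking_network(star, triangle_half, K))
print(is_mimicking_network(star, triangle_one, K))
```

In this example every terminal bipartition of the star has minimum cut 1, which the half-capacity triangle reproduces using only the three terminals, while the unit-capacity triangle gives 2 and therefore fails the check.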