Graph sparsification has been studied extensively over the past two decades, culminating in spectral sparsifiers of optimal size (up to constant factors). Spectral hypergraph sparsification is a natural analogue of this problem, for which optimal bounds on the sparsifier size are not known, mainly because the hypergraph Laplacian is non-linear, and thus lacks the linear-algebraic structure and tools that have been so effective for graphs. Our main contribution is the first algorithm for constructing $epsilon$-spectral sparsifiers for hypergraphs with $O^*(n)$ hyperedges, where $O^*$ suppresses $(epsilon^{-1} log n)^{O(1)}$ factors. This bound is independent of the rank $r$ (maximum cardinality of a hyperedge), and is essentially best possible due to a recent bit complexity lower bound of $Omega(nr)$ for hypergraph sparsification. This result is obtained by introducing two new tools. First, we give a new proof of spectral concentration bounds for sparsifiers of graphs; it avoids linear-algebraic methods, replacing e.g.~the usual application of the matrix Bernstein inequality and therefore applies to the (non-linear) hypergraph setting. To achieve the result, we design a new sequence of hypergraph-dependent $epsilon$-nets on the unit sphere in $mathbb{R}^n$. Second, we extend the weight assignment technique of Chen, Khanna and Nagda [FOCS20] to the spectral sparsification setting. Surprisingly, the number of spanning trees after the weight assignment can serve as a potential function guiding the reweighting process in the spectral setting.