Recent studies have shown that the number counts of convergence peaks N(kappa) in weak lensing (WL) maps, expected from large forthcoming surveys, can be a useful probe of cosmology. We follow up on this finding and use a suite of WL convergence maps, obtained from ray-tracing N-body simulations, to study (i) the physical origin of WL peaks of different heights, and (ii) whether the peaks contain information beyond the convergence power spectrum P_ell. In agreement with earlier work, we find that high peaks (with amplitudes >~ 3.5 sigma, where sigma is the r.m.s. of the convergence kappa) are typically dominated by a single massive halo. In contrast, medium-height peaks (~0.5-1.5 sigma) cannot be attributed to a single collapsed dark matter halo; they are instead created by the projection of multiple (typically 4-8) halos along the line of sight, and by random galaxy shape noise. Nevertheless, these medium-height peaks dominate the sensitivity to the cosmological parameters w, sigma_8, and Omega_m. We find that the peak height distribution and its dependence on cosmology differ significantly from the predictions for a Gaussian random field. We directly compute the marginalized errors on w, sigma_8, and Omega_m from the N(kappa) + P_ell combination, including redshift tomography with source galaxies at z_s=1 and z_s=2. We find that the N(kappa) + P_ell combination is approximately twice as sensitive to cosmology as P_ell alone. These results demonstrate that N(kappa) contains non-Gaussian information complementary to the power spectrum.
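
As a rough illustration of the peak-count statistic N(kappa) described above, the following sketch finds local maxima in a pixelized convergence map and bins them by height in units of sigma, the r.m.s. of the map. The 8-neighbor definition of a peak, the function names, the bin edges, and the white-noise stand-in map are all assumptions made for illustration; they are not taken from the simulations or the analysis pipeline of this work.

```python
import numpy as np


def find_peaks(kappa):
    """Return the heights of local maxima in a 2D map.

    Assumption: a peak is an interior pixel whose value strictly exceeds
    all 8 of its neighbors. Edge pixels are ignored.
    """
    n_rows, n_cols = kappa.shape
    center = kappa[1:-1, 1:-1]
    is_peak = np.ones(center.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # Shifted view of the map aligned with the interior pixels.
            neighbor = kappa[1 + dy:n_rows - 1 + dy, 1 + dx:n_cols - 1 + dx]
            is_peak &= center > neighbor
    return center[is_peak]


def peak_counts(kappa, bin_edges_sigma):
    """Histogram peak heights in units of sigma, the r.m.s. of the map."""
    sigma = kappa.std()
    heights = find_peaks(kappa) / sigma
    counts, _ = np.histogram(heights, bins=bin_edges_sigma)
    return counts


if __name__ == "__main__":
    # Hypothetical stand-in: a pure white-noise map, NOT a ray-traced
    # convergence map. It only demonstrates how the functions are called.
    rng = np.random.default_rng(0)
    noise_map = rng.normal(size=(256, 256))
    edges = np.array([0.5, 1.5, 2.5, 3.5, 10.0])  # illustrative bins in sigma
    print(peak_counts(noise_map, edges))
```

Comparing such histograms between maps generated under different cosmologies, or against a Gaussian random field with the same power spectrum, is the kind of measurement the abstract refers to; any real analysis would also need smoothing, shape noise, and a covariance estimate, none of which is sketched here.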