We propose a novel randomized channel sparsifying hybrid precoding (RCSHP) design that reduces the signaling overhead of channel estimation, as well as the hardware cost and power consumption at the base station (BS), in order to fully harvest the benefits of frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) systems. RCSHP allows time-sharing among multiple analog precoders, each serving a compatible user group. Each analog precoder is adapted to the channel statistics so that it properly sparsifies the channel for the associated user group: the resulting effective channel (the product of the channel and the analog precoder) not only has enough spatial degrees of freedom (DoF) to serve this group of users, but can also be accurately estimated under the limited pilot budget. The digital precoder is adapted to the effective channel based on duality theory to facilitate power allocation and exploit the spatial multiplexing gain. We formulate the joint optimization of the time-sharing factors and the associated sets of analog precoders and power allocations as a general utility optimization problem that accounts for the impact of effective channel estimation error on system performance. We then propose an efficient stochastic successive convex approximation algorithm that provably obtains Karush-Kuhn-Tucker (KKT) points of this problem.
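To make the notion of an effective channel concrete, the following is a minimal NumPy sketch for a single user group, not the proposed RCSHP algorithm. All names and values (`M`, `S`, `K`, `D`, `active`, `F`, `G`) are illustrative assumptions: the toy channel is assumed to have its energy confined to `S` DFT beam directions, the analog precoder simply selects those beams, and a zero-forcing digital precoder is used as a stand-in for the duality-based design. Time-sharing, power allocation, and estimation error are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
M, S, K = 64, 8, 4  # BS antennas, RF chains, users (illustrative values)

# Unitary DFT matrix; its columns act as candidate analog beams (assumption).
D = np.fft.fft(np.eye(M)) / np.sqrt(M)

# Toy channel whose energy lies in S angular directions (assumption).
active = rng.choice(M, size=S, replace=False)
A = (rng.standard_normal((K, S)) + 1j * rng.standard_normal((K, S))) / np.sqrt(2)
H = A @ D[:, active].conj().T   # K x M full channel

# Analog precoder: select the active beams; entries have constant modulus.
F = D[:, active]                # M x S
H_eff = H @ F                   # K x S effective channel seen by the digital stage

# Zero-forcing digital precoder, a stand-in for the duality-based design.
G = np.linalg.pinv(H_eff)       # S x K
G /= np.linalg.norm(F @ G)      # normalise total transmit power to 1

end_to_end = H @ F @ G          # K x K, diagonal up to numerical error
print(H.shape, "->", H_eff.shape)
```

In this toy setting only the `K x S` effective channel needs to be estimated rather than the `K x M` full channel, which is the sense in which a well-chosen analog precoder lowers the pilot requirement while leaving enough spatial dimensions (`S >= K`) to separate the users.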