A reconfigurable intelligent surface (RIS) is envisioned as a promising green technology for reducing energy consumption and improving the coverage and spectral efficiency of massive multiple-input multiple-output (MIMO) wireless networks. In an RIS-aided MIMO system, acquiring channel state information (CSI) is essential for achieving the passive beamforming gains of the RIS, but it is also challenging because of the cascaded structure of the transmitter-RIS-receiver channel and the lack of signal processing capability at the passive RIS elements. The state-of-the-art approach to CSI acquisition in such a system is a purely training-based strategy that relies on a long sequence of pilot symbols. In this paper, we investigate semi-blind cascaded channel estimation for RIS-aided massive MIMO systems, in which the receiver jointly estimates the channel coefficients and the partially unknown transmit signal using only a small number of pilot sequences. Specifically, we formulate semi-blind cascaded channel estimation as a trilinear matrix factorization task. Under the Bayesian inference framework, we develop a computationally efficient iterative algorithm based on the approximate message passing principle to solve the trilinear inference problem. In addition, we present an analytical framework that characterizes the theoretical performance bound of the proposed approach in the large-system limit via the replica method developed in statistical physics. Extensive simulation results demonstrate the effectiveness of the proposed semi-blind cascaded channel estimation algorithm.
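
To make the trilinear structure concrete, the following is a minimal sketch of one signal model that would lead to such a factorization. The abstract itself does not state the model, so every symbol here is an assumption introduced for illustration: an $M$-antenna receiver, an RIS with $N$ elements, $K$ single-antenna transmitters, and $T$ symbol slots, of which the first $T_p$ carry pilots.

```latex
% Illustrative model only; all symbols are assumptions, not taken from the abstract.
% Y : M x T received signal            W : M x T additive noise
% H : M x N RIS-to-receiver channel    (unknown)
% G : N x K transmitter-to-RIS channel (unknown)
% S : N x T RIS phase-shift coefficients over time (assumed known)
% X : K x T transmit signal, pilots X_p known, data X_d unknown
\begin{align}
  \mathbf{Y} &= \mathbf{H}\,\bigl(\mathbf{S} \odot (\mathbf{G}\mathbf{X})\bigr) + \mathbf{W}, \\
  \mathbf{X} &= \bigl[\,\mathbf{X}_p \;\; \mathbf{X}_d\,\bigr],
  \qquad \mathbf{X}_p \in \mathbb{C}^{K \times T_p},
  \quad \mathbf{X}_d \in \mathbb{C}^{K \times (T - T_p)}.
\end{align}
```

Under these assumptions, the observation $\mathbf{Y}$ depends on the product of three factors, $\mathbf{H}$, $\mathbf{G}$, and $\mathbf{X}$, two of which are fully unknown and one of which is partially unknown. A purely training-based scheme would correspond to $T_p = T$, whereas the semi-blind setting takes $T_p \ll T$ and estimates $\mathbf{X}_d$ together with the channels.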