Due to the explosive growth in the number of wireless devices and the emergence of diverse wireless services, such as virtual/augmented reality and the Internet-of-Everything, next generation wireless networks face unprecedented challenges caused by heterogeneous data traffic, massive connectivity, and requirements for ultra-high bandwidth efficiency and ultra-low latency. To address these challenges, advanced multiple access schemes, referred to as next generation multiple access (NGMA), are expected to be developed. These schemes should be capable of supporting massive numbers of users in a more resource- and complexity-efficient manner than existing multiple access schemes. As research on NGMA is still at a very early stage, in this paper we explore the evolution of NGMA with a particular focus on non-orthogonal multiple access (NOMA), i.e., the transition from NOMA to NGMA. In particular, we first review the fundamental capacity limits of NOMA, elaborate on the new requirements for NGMA, and discuss several possible candidate techniques. Moreover, given the high compatibility and flexibility of NOMA, we provide an overview of current research efforts on multi-antenna techniques for NOMA, promising future application scenarios of NOMA, and the interplay between NOMA and other emerging physical layer techniques. Furthermore, we discuss advanced mathematical tools for facilitating the design of NOMA communication systems, including conventional optimization approaches and new machine learning techniques. Next, we propose a unified framework for NGMA based on multiple antennas and NOMA, where both downlink and uplink transmission are considered, thus setting the foundation for this emerging research area. Finally, several practical implementation challenges for NGMA are highlighted as motivation for future work.