Until quantum error correction (QEC) is achieved, quantum computing focuses on noisy intermediate-scale quantum (NISQ) applications. Compared to the well-known quantum algorithms that require QEC, such as Shor's or Grover's algorithm, NISQ applications have different structures and properties to exploit in compilation. A key step in compilation is mapping the qubits in the program to physical qubits on a given quantum computer, a problem that has been shown to be NP-hard. In this paper, we present OLSQ-GA, an optimal qubit mapper whose key feature is simultaneous SWAP gate absorption during qubit mapping, which we show to be a very effective optimization technique for NISQ applications. For the quantum approximate optimization algorithm (QAOA), an important class of NISQ applications, OLSQ-GA reduces depth by up to 50.0% and SWAP count by 100% compared to other state-of-the-art methods, which translates to a 55.9% fidelity improvement. OLSQ-GA achieves solution optimality through an exact satisfiability modulo theories (SMT) formulation. For better scalability, we augment our approach with additional constraints in the form of initial mapping or alternating matching, which speeds up OLSQ-GA by up to 272× with little or no loss of optimality.
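To make the idea of SWAP gate absorption concrete, the following minimal sketch shows why a SWAP next to a QAOA-style two-qubit phase gate on the same qubit pair can be treated as one two-qubit gate. It assumes the phase gate has the form exp(-iγ Z⊗Z); the names `zz_phase`, `absorbed_gate`, and `gamma` are illustrative and not taken from OLSQ-GA itself, and the sketch covers only the gate-level identity, not the mapper's SMT formulation.

```python
import numpy as np

# SWAP on two qubits, in the computational basis |00>, |01>, |10>, |11>.
SWAP = np.array(
    [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]],
    dtype=complex,
)


def zz_phase(gamma: float) -> np.ndarray:
    """Return exp(-i * gamma * Z (x) Z) as a 4x4 diagonal matrix (assumed gate form)."""
    zz_diagonal = np.array([1.0, -1.0, -1.0, 1.0])  # eigenvalues of Z (x) Z
    return np.diag(np.exp(-1j * gamma * zz_diagonal))


def absorbed_gate(gamma: float) -> np.ndarray:
    """Merge the phase gate and the following SWAP into one two-qubit unitary."""
    return SWAP @ zz_phase(gamma)


gamma = 0.37  # arbitrary illustrative angle
merged = absorbed_gate(gamma)

# The merged operation is still a single valid two-qubit unitary.
assert np.allclose(merged.conj().T @ merged, np.eye(4))

# SWAP commutes with the ZZ phase gate, so it can be absorbed from either side.
assert np.allclose(SWAP @ zz_phase(gamma), zz_phase(gamma) @ SWAP)
```

Because the merged matrix acts on the same two qubits as the original phase gate, a mapper that can place the SWAP next to such a gate does not need to count it as a separate gate, which is consistent with the SWAP count reduction reported above.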