The Hierarchical Schur Complement method (HSC), and the HSC-extension, have significantly accelerated the evaluation of the retarded Greens function, particularly the lesser Greens function, for two-dimensional nanoscale devices. In this work, the HSC-extension is applied to determine the solution of non-equilibrium Greens functions (NEGF) on three-dimensional nanoscale devices. The operation count for the HSC-extension is analyzed for a cuboid device. When a cubic device is discretized with $N times N times N$ grid points, the state-of-the-art Recursive Green Function (RGF) algorithm takes $mathcal{O}(N^7)$ operations, whereas the HSC-extension only requires $mathcal{O}(N^6)$ operations. %Realistic operation counts also depend on the system dimensions in $xyz$-directions and the form of contact self-energy matrix. Operation counts and runtimes are also studied for three-dimensional nanoscale devices of practical interest: a graphene-boron- nitride-graphene multilayer system, a silicon nanowire, and a DNA molecule. The numerical experiments indicate that the cost for the HSC-extension is proportional to the solution of one linear system (or one LU-factorization) and that the runtime speed-ups over RGF exceed three orders of magnitude when simulating realistic devices, such as a graphene-boron nitride-graphene multilayer system with 40,000 atoms.