We present a minimally invasive endoscope based on a multimode fiber that combines photoacoustic and fluorescence sensing. Using a transmission matrix measured during a prior calibration step, a fast spatial light modulator produces a focused spot and raster-scans it over a sample at the distal tip of the fiber. An ultra-sensitive fiber-optic ultrasound sensor, placed next to the fiber for photoacoustic detection, is combined with a photodetector to obtain both fluorescence and photoacoustic images with a distal imaging tip no larger than 250 µm. The high signal-to-noise ratio provided by wavefront-shaping-based focusing and the ultra-sensitive ultrasound sensor enables imaging with a single laser shot per pixel, which we demonstrate through fast two-dimensional hybrid imaging of red blood cells and fluorescent beads.
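As a rough illustration of how a measured transmission matrix can be turned into a focused spot, the sketch below simulates focusing with NumPy. It rests on several assumptions that are not stated in the text above: the fiber is modeled as a random complex Gaussian matrix, the modulator is treated as phase-only, and the focusing pattern is obtained by phase conjugation of one matrix row. The names `focusing_phase_mask` and `enhancement` are hypothetical helpers, and the mode counts are arbitrary.

```python
import numpy as np


def focusing_phase_mask(tm, target):
    """Phase-only input pattern that focuses on output pixel `target`.

    Assumption: phase conjugation of the row of `tm` that maps the
    input modes to the chosen output pixel.
    """
    return np.exp(-1j * np.angle(tm[target, :]))


def enhancement(tm, mask, target):
    """Ratio of focus intensity to the mean intensity of all other pixels."""
    intensity = np.abs(tm @ mask) ** 2
    background = np.delete(intensity, target).mean()
    return intensity[target] / background


# Stand-in for a calibrated fiber: random complex Gaussian matrix
# (assumed model, not measured data).
rng = np.random.default_rng(0)
n_out, n_in = 256, 128
tm = (rng.normal(size=(n_out, n_in))
      + 1j * rng.normal(size=(n_out, n_in))) / np.sqrt(2 * n_in)

target = 100
mask = focusing_phase_mask(tm, target)
intensity = np.abs(tm @ mask) ** 2
eta = enhancement(tm, mask, target)

# Raster scanning amounts to displaying one such mask per output pixel.
scan_peaks = [int(np.argmax(np.abs(tm @ focusing_phase_mask(tm, k)) ** 2))
              for k in range(8)]
```

In this toy model the focus-to-background ratio grows roughly in proportion to the number of controlled input modes, which is one way to see why wavefront-shaped focusing can raise the signal-to-noise ratio enough for single-shot-per-pixel acquisition.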