BLISlab: A Sandbox for Optimizing GEMM


Abstract in English

Matrix-matrix multiplication is a fundamental operation of great importance to scientific computing and, increasingly, machine learning. It is a simple enough concept to be introduced in a typical high school algebra course, yet in practice important enough that its implementation on computers continues to be an active research topic. This note describes a set of exercises that use this operation to illustrate how high performance can be attained on modern CPUs with hierarchical memories (multiple caches). It does so by building on the insights that underlie the BLAS-like Library Instantiation Software (BLIS) framework, exposing a simplified sandbox that mimics the implementation in BLIS. As such, it also becomes a vehicle for crowdsourcing the optimization of BLIS. We call this set of exercises BLISlab.
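To make the starting point concrete, the following is a minimal sketch of the operation the abstract refers to, written in C. It is not taken from BLISlab or BLIS: the function names `gemm_ref` and `gemm_blocked`, the row-major storage, and the single square block size `bs` are all assumptions made for illustration. The first routine is the textbook triple loop; the second performs the same updates tile by tile, which is the basic idea behind keeping operands in cache on a CPU with hierarchical memory.

```c
#include <assert.h>
#include <stddef.h>

/* Reference GEMM: C += A * B.
   Assumption for this sketch: all matrices are row-major,
   A is m x k, B is k x n, C is m x n. */
void gemm_ref(size_t m, size_t n, size_t k,
              const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t p = 0; p < k; ++p) {
            double a = A[i * k + p];
            for (size_t j = 0; j < n; ++j)
                C[i * n + j] += a * B[p * n + j];
        }
    }
}

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Blocked GEMM: same arithmetic as gemm_ref, but the loops are tiled
   with a hypothetical block size bs so that each tile of A, B, and C
   is reused while it is likely still in cache. */
void gemm_blocked(size_t m, size_t n, size_t k, size_t bs,
                  const double *A, const double *B, double *C)
{
    assert(bs > 0); /* a zero block size would never advance the loops */

    for (size_t ii = 0; ii < m; ii += bs) {
        size_t i_end = min_sz(ii + bs, m);
        for (size_t pp = 0; pp < k; pp += bs) {
            size_t p_end = min_sz(pp + bs, k);
            for (size_t jj = 0; jj < n; jj += bs) {
                size_t j_end = min_sz(jj + bs, n);
                /* Multiply one tile of A by one tile of B. */
                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t p = pp; p < p_end; ++p) {
                        double a = A[i * k + p];
                        for (size_t j = jj; j < j_end; ++j)
                            C[i * n + j] += a * B[p * n + j];
                    }
                }
            }
        }
    }
}
```

Both routines accumulate into C, so the caller is expected to initialize C (for example, to zero) beforehand. A production implementation would go further than this sketch, with separate block sizes per loop, packing of tiles into contiguous buffers, and a hand-tuned inner kernel; those refinements are deliberately left out here.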
