In object recognition, the Fisher vector (FV) is one of the state-of-the-art image representations, but it comes at the expense of dense, high-dimensional features and increased computation time. A simplification of FV is therefore attractive, and we propose the Sparse Fisher vector (SFV). By incorporating a locality strategy, we accelerate the Fisher coding step in image categorization, which is computed from a collection of local descriptors. We also examine the relationship between the coding step and the pooling step to give a theoretical explanation of SFV. Experiments on benchmark datasets show that SFV achieves a several-fold speedup over FV while maintaining categorization performance. In addition, we demonstrate how SFV preserves consistency in the representation of similar local features.
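To make the idea of a locality-restricted Fisher coding step concrete, the following is a minimal sketch and not the formulation used in this work. It assumes a diagonal-covariance Gaussian mixture model, assumes that "locality" means keeping only the `k` most probable components for each local descriptor, and assumes average pooling. The function names `posteriors` and `sparse_fisher_vector` and all parameter names are hypothetical, chosen for illustration.

```python
import numpy as np


def posteriors(X, weights, means, sigmas):
    """Soft assignment of N descriptors (N x D) to K diagonal Gaussians."""
    # Standardized differences for every (descriptor, component) pair: N x K x D.
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]
    # Log-density up to a constant that cancels when normalizing over components.
    log_prob = -0.5 * np.sum(diff ** 2, axis=2) - np.sum(np.log(sigmas), axis=1)
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


def sparse_fisher_vector(X, weights, means, sigmas, k):
    """Fisher vector that uses only the k most probable components per descriptor.

    Assumption (illustrative only): locality is enforced by truncating the
    posteriors to the top k entries and renormalizing them.
    """
    N, D = X.shape
    K = means.shape[0]
    post = posteriors(X, weights, means, sigmas)

    # Indices of the k largest posteriors in each row (order within them is arbitrary).
    idx = np.argpartition(post, -k, axis=1)[:, -k:]        # N x k
    rows = np.arange(N)[:, None]
    gamma = post[rows, idx]                                # N x k
    gamma = gamma / gamma.sum(axis=1, keepdims=True)       # renormalize kept mass

    # Differences are computed only for the selected components: N x k x D.
    diff = (X[:, None, :] - means[idx]) / sigmas[idx]

    # Average pooling: accumulate per-descriptor codes into per-component sums.
    g_mu = np.zeros((K, D))
    g_sigma = np.zeros((K, D))
    np.add.at(g_mu, idx, gamma[..., None] * diff)
    np.add.at(g_sigma, idx, gamma[..., None] * (diff ** 2 - 1.0))

    # Standard Fisher vector normalization by the mixture weights.
    g_mu /= N * np.sqrt(weights)[:, None]
    g_sigma /= N * np.sqrt(2.0 * weights)[:, None]
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])
```

In this sketch the gradient accumulation touches only `k` components per descriptor instead of all `K`, which is where the saving would come from; the posterior computation itself is still dense here. Setting `k` equal to `K` recovers the ordinary dense Fisher vector, which gives a simple sanity check.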