We study a natural question about sparse random matrices which arises from an application in distributed computing: what is the distance from a fixed vector to the column span of a sparse random matrix $A \in \mathbb{R}^{n \times m}$? We answer this question for several ensembles of sparse random matrices in which the average number of non-zero entries per column, $d$, is smaller than $\log(n)$. Key to our analysis is a new characterization of linear dependencies in sparse random matrices. We show that with high probability, in certain random matrices, including rectangular matrices with i.i.d.~Bernoulli entries and $m \geq (1 + \epsilon)n$, and symmetric random matrices with Bernoulli entries, any linear dependency must be caused by one of three specific combinatorial structures. We show applications of our result to analyzing and designing \emph{gradient codes}, replication schemes used in distributed machine learning to mitigate the effect of slow machines, called \emph{stragglers}. We give the first known construction for a gradient code that achieves near-optimal error for both random and adversarial choices of stragglers.