Information Source Detection with Limited Time Knowledge


Abstract in English

This paper investigates how to use network topology and partial timestamps to detect the information source in a network. Canonical maximum likelihood estimation (MLE) of the source is prohibitively expensive because the number of possible infection paths is exponential. Our main idea is therefore to approximate the MLE by an alternative infection-path-based estimator, which identifies the most likely infection path that is consistent with the observed timestamps. The source node associated with that infection path is taken as the estimated source $\hat{v}$. We first study the case of tree topology, where, by transforming the infection-path-based estimator into an integer linear program, we find a reduced search region that remarkably improves time efficiency. Within this reduced search region, the estimator $\hat{v}$ is provably always on a path that we term the \emph{candidate path}. This notion enables us to analyze the distribution of $d(v^{\ast},\hat{v})$, the error distance between $\hat{v}$ and the true source $v^{\ast}$, on an arbitrary tree, which yields, for the first time in the literature, a provable performance guarantee for the estimator under limited timestamps. Specifically, on the infinite $g$-regular tree with uniformly sampled timestamps, we obtain a refined performance guarantee in the sense that $d(v^{\ast},\hat{v})$ is bounded by a constant. By virtue of a time-labeled BFS tree, the estimator still performs fairly well when extended to more general graphs. Experiments on both synthetic and real datasets further demonstrate the superior performance of our proposed algorithms.
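
In symbols, the infection-path-based estimator described above can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: $V$ is the node set, $\mathcal{T}$ denotes the observed partial timestamps, $\mathcal{P}(v,\mathcal{T})$ is the set of infection paths that start at node $v$ and are consistent with $\mathcal{T}$, and $\Pr(P \mid v)$ is the likelihood of infection path $P$ given source $v$ under whatever diffusion model is assumed.

$$
\hat{v} \in \arg\max_{v \in V} \; \max_{P \in \mathcal{P}(v,\mathcal{T})} \Pr(P \mid v)
$$

Roughly speaking, the MLE would instead aggregate the likelihood over all consistent infection paths for each candidate source, and that aggregation over exponentially many paths is what the inner maximum is meant to avoid.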
