Accurately Modeling Biased Random Walks on Weighted Graphs Using $textit{Node2vec+}$


Abstract in English

Node embedding is a powerful approach for representing the structural role of each node in a graph. $textit{Node2vec}$ is a widely used method for node embedding that works by exploring the local neighborhoods via biased random walks on the graph. However, $textit{node2vec}$ does not consider edge weights when computing walk biases. This intrinsic limitation prevents $textit{node2vec}$ from leveraging all the information in weighted graphs and, in turn, limits its application to many real-world networks that are weighted and dense. Here, we naturally extend $textit{node2vec}$ to $textit{node2vec+}$ in a way that accounts for edge weights when calculating walk biases, but which reduces to $textit{node2vec}$ in the cases of unweighted graphs or unbiased walks. We empirically show that $textit{node2vec+}$ is more robust to additive noise than $textit{node2vec}$ in weighted graphs using two synthetic datasets. We also demonstrate that $textit{node2vec+}$ significantly outperforms $textit{node2vec}$ on a commonly benchmarked multi-label dataset (Wikipedia). Furthermore, we test $textit{node2vec+}$ against GCN and GraphSAGE using various challenging gene classification tasks on two protein-protein interaction networks. Despite some clear advantages of GCN and GraphSAGE, they show comparable performance with $textit{node2vec+}$. Finally, $textit{node2vec+}$ can be used as a general approach for generating biased random walks, benefiting all existing methods built on top of $textit{node2vec}$. $textit{Node2vec+}$ is implemented as part of $texttt{PecanPy}$, which is available at https://github.com/krishnanlab/PecanPy .

Download