Triangular clustering in document networks


English abstract

Document networks are distinctive in that each node, e.g. a webpage or an article, carries meaningful content. Their properties are affected not only by the topological connectivity between nodes but also, strongly, by the semantic relation between the nodes' content. We observe that document networks contain a large number of triangles and have a high clustering coefficient, and that the probability of a triangle forming is strongly correlated with the content similarity among the three nodes involved. We propose the degree-similarity product (DSP) model, which reproduces these properties well. The model achieves this through a preferential attachment mechanism that favours links between nodes that are both popular and similar. This work is a step towards a better understanding of the structure and evolution of document networks.
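
The abstract does not state the exact attachment rule, so the following is only a minimal sketch of the idea, assuming that a new node links to an existing node i with probability proportional to degree(i) times the cosine similarity of their content. The random content vectors, the function names (grow_dsp, count_triangles, average_clustering) and the parameters are all hypothetical and are not taken from the paper. The sketch also shows how the two quantities named above, the triangle count and the clustering coefficient, can be measured on the result.

```python
import math
import random
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two content vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def grow_dsp(n_nodes, m_links, n_topics=5, seed=0):
    """Grow an undirected graph by degree-times-similarity attachment.

    ASSUMPTION: the weight of existing node i is degree(i) * cosine(new, i);
    the paper's actual functional form may differ.
    """
    rng = random.Random(seed)
    # Hypothetical content: one random topic vector per document.
    content = [[rng.random() for _ in range(n_topics)] for _ in range(n_nodes)]
    adj = {i: set() for i in range(n_nodes)}
    # Seed graph: a clique on the first m_links + 1 nodes.
    for i, j in combinations(range(m_links + 1), 2):
        adj[i].add(j)
        adj[j].add(i)
    for new in range(m_links + 1, n_nodes):
        candidates = list(range(new))
        weights = [len(adj[i]) * cosine(content[new], content[i])
                   for i in candidates]
        # Draw until m_links distinct targets have been chosen.
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choices(candidates, weights=weights, k=1)[0])
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj, content


def count_triangles(adj):
    """Count each triangle once by ordering its nodes u < v < w."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if v > u:
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)
```

For instance, calling adj, content = grow_dsp(200, 3) and then count_triangles(adj) and average_clustering(adj) gives the two statistics for one synthetic network. Replacing the weight with len(adj[i]) alone gives plain degree-based preferential attachment, which is a natural baseline for comparison.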
