GenCAT: Generating Attributed Graphs with Controlled Relationships between Classes, Attributes, and Topology


Abstract in English

Generating large synthetic attributed graphs with node labels is an important task for supporting experimental studies of graph analysis methods. Existing graph generators fail to simultaneously simulate the relationships between labels, attributes, and topology that real-world graphs exhibit. Motivated by this limitation, we propose GenCAT, an attributed graph generator that controls those relationships and has the following advantages. (i) GenCAT generates graphs with user-specified node degrees and flexibly controls the relationship between nodes and labels by incorporating, for each node, its proportion of connections to each class. (ii) Generated attribute values follow user-specified distributions, and users can flexibly control the correlation between attributes and labels. (iii) Graph generation scales linearly with the number of edges. GenCAT is the first generator to support all three of these practical features. Through extensive experiments, we demonstrate that GenCAT efficiently generates high-quality, complex attributed graphs with user-controlled relationships between labels, attributes, and topology.
