Network Classification and Categorization

Published by Ryan Rossi
Publication date: 2017
Research field: Informatics Engineering
Research language: English





To the best of our knowledge, this paper presents the first large-scale study that tests whether network categories (e.g., social networks vs. web graphs) are distinguishable from one another, using both categories of real-world networks and synthetic graphs. A classification accuracy of 94.2% was achieved using a random forest classifier with both real and synthetic networks. This work makes two important findings. First, real-world networks from various domains have distinct structural properties that allow us to predict with high accuracy the category of an arbitrary network. Second, classifying synthetic networks is trivial, as our models can easily distinguish between synthetic graphs and the real-world networks they are supposed to model.
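As a rough illustration of what "structural properties" could mean in practice, the sketch below computes a small feature vector for an undirected graph. The abstract does not list the features the authors used, so the three chosen here (density, mean degree, transitivity) and the function name `structural_features` are assumptions made purely for illustration. Vectors like these, one per network, could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`.

```python
def structural_features(adj):
    """Return (density, mean_degree, transitivity) for an undirected graph.

    `adj` maps each node to the set of its neighbors (hypothetical input
    format chosen for this sketch; every edge must appear in both directions).
    """
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge counted twice

    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    mean_degree = 2 * m / n if n else 0.0

    # Transitivity = closed triples / all connected triples.
    triples = 0  # paths of length two, counted at their center node
    closed = 0   # those paths whose two endpoints are also adjacent
    for nbrs in adj.values():
        nb = list(nbrs)
        k = len(nb)
        triples += k * (k - 1) // 2
        for i in range(k):
            for j in range(i + 1, k):
                if nb[j] in adj[nb[i]]:
                    closed += 1
    transitivity = closed / triples if triples else 0.0

    return density, mean_degree, transitivity


# Two toy graphs: a triangle and a three-node path.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

triangle_features = structural_features(triangle)
path_features = structural_features(path)
```

The two toy graphs have the same number of nodes but differ in every feature, which is the kind of separation a classifier would rely on at a much larger scale.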




Read also

A main challenge in mining network-based data is finding effective ways to represent or encode graph structures so that they can be efficiently exploited by machine learning algorithms. Several methods have focused on network representation at the node/edge or substructure level. However, many real-life challenges, such as time-varying, multilayer, chemical-compound, and brain networks, involve analysis of a family of graphs instead of a single one, opening additional challenges in graph comparison and representation. Traditional approaches for learning representations rely on hand-crafting specialized heuristics to extract meaningful information about the graphs (e.g., statistical properties, structural features), as well as engineered graph distances to quantify dissimilarity between networks. In this work we provide an unsupervised approach to learn embedding representations for a collection of graphs so that they can be used in numerous graph mining tasks. By using an unsupervised neural network approach on input graphs, we aim to capture the underlying distribution of the data in order to discriminate between different classes of networks. Our method is assessed empirically on synthetic and real-life datasets and evaluated in three different tasks: graph clustering, visualization, and classification. Results reveal that our method outperforms well-known graph distances and graph kernels in clustering and classification tasks while being highly efficient in runtime.
We study the lobby index (l-index for short) as a local node centrality measure for complex networks. The l-index is compared with degree (a local measure) and with betweenness and eigenvector centralities (two global measures) in the case of a biological network (the yeast protein-protein interaction network) and a linguistic network (Moby Thesaurus II). In both networks, the l-index has poor correlation with betweenness but correlates with degree and eigenvector centrality. Being a local measure, the l-index is advantageous because it carries more information about a node's neighbors than degree centrality, and it requires less time to compute than eigenvector centrality. Results suggest that the l-index produces better results than degree and eigenvector measures for ranking purposes, making it a suitable tool for this task.
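The abstract does not restate the definition, so the sketch below assumes the standard one: the l-index of a node is the largest k such that the node has at least k neighbors, each of degree at least k. The function name `lobby_index` and the adjacency-dictionary input are hypothetical choices for this illustration, not taken from the paper.

```python
def lobby_index(adj, node):
    """Largest k such that `node` has at least k neighbors of degree >= k.

    `adj` maps each node to the set of its neighbors (undirected graph).
    """
    # Degrees of the node's neighbors, largest first.
    neighbor_degrees = sorted((len(adj[n]) for n in adj[node]), reverse=True)
    k = 0
    for rank, degree in enumerate(neighbor_degrees, start=1):
        if degree >= rank:
            k = rank   # the top `rank` neighbors all have degree >= rank
        else:
            break      # degrees only get smaller from here
    return k


# A star: hub "h" has degree 4, but every neighbor has degree 1.
star = {
    "h": {"a", "b", "c", "d"},
    "a": {"h"}, "b": {"h"}, "c": {"h"}, "d": {"h"},
}

# A 4-clique: every node has degree 3 and three neighbors of degree 3.
clique = {
    "w": {"x", "y", "z"},
    "x": {"w", "y", "z"},
    "y": {"w", "x", "z"},
    "z": {"w", "x", "y"},
}

hub_l = lobby_index(star, "h")
clique_l = lobby_index(clique, "w")
```

The star's hub has a higher degree than any clique node yet a lower l-index, which shows how the measure takes neighbor connectivity into account while still using only local information.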
Hao Peng, Jianxin Li, Qiran Gong, 2019
Events happen in the real world and in real time, and can be planned and organized occasions involving multiple people and objects. Social media platforms publish a lot of text messages containing public events with comprehensive topics. However, mining social events is challenging due to the heterogeneous event elements in texts and the explicit and implicit social network structures. In this paper, we design an event meta-schema to characterize the semantic relatedness of social events, build an event-based heterogeneous information network (HIN) integrating information from an external knowledge base, and propose a novel Pair-wise Popularity Graph Convolutional Network (PP-GCN) based fine-grained social event categorization model. We propose a Knowledgeable meta-paths Instances based social Event Similarity (KIES) between events and build a weighted adjacency matrix as input to the PP-GCN model. Comprehensive experiments on real data collections are conducted to compare various social event detection and clustering tasks. Experimental results demonstrate that our proposed framework outperforms other alternative social event categorization techniques.
Massimo Franceschet, 2011
We represent collaboration of authors in computer science papers in terms of both affiliation and collaboration networks and observe how these networks have evolved since 1960. We investigate the temporal evolution of bibliometric properties, such as the size of the discipline, the productivity of scholars, and the collaboration level in papers, as well as of large-scale network properties, such as reachability and average separation distance among scientists, the distribution of the number of scholar collaborators, network clustering, and network assortativity by number of collaborators.
Author name ambiguity causes inadequacy and inconvenience in academic information retrieval, which raises the necessity of author name disambiguation (AND). Existing AND methods can be divided into two categories: models that focus on content information to distinguish whether two papers are written by the same author, and models that focus on relation information, representing information as edges on a network and quantifying the similarity among papers. However, the former requires adequate labeled samples and informative negative samples, and is also ineffective in measuring the high-order connections among papers, while the latter needs complicated feature engineering or supervision to construct the network. We propose a novel generative adversarial framework to grow the two categories of models together: (i) the discriminative module distinguishes whether two papers are from the same author, and (ii) the generative module selects possibly homogeneous papers directly from the heterogeneous information network, which eliminates the complicated feature engineering. In this way, the discriminative module guides the generative module to select homogeneous papers, and the generative module generates high-quality negative samples to train the discriminative module and make it aware of high-order connections among papers. Furthermore, a self-training strategy for the discriminative module and a random-walk-based generating algorithm are designed to make the training stable and efficient. Extensive experiments on two real-world AND benchmarks demonstrate that our model provides significant performance improvement over state-of-the-art methods.