Sampling is a widely used graph reduction technique for accelerating graph computations and simplifying graph visualizations. Based on a comprehensive analysis of the graph sampling literature, we hypothesize that existing algorithms cannot effectively preserve minority structures, which are rare and small in a graph but very important in graph analysis. In this work, we first conduct a pilot user study to identify the representative minority structures that are most appealing to human viewers. We then perform an experimental study to evaluate how well existing graph sampling algorithms preserve minority structures. The results confirm our hypothesis and suggest key points for designing a new graph sampling approach, named mino-centric graph sampling (MCGS). In this approach, a triangle-based algorithm and a cut-point-based algorithm are proposed to efficiently identify minority structures. A set of importance assessment criteria is designed to guide the preservation of important minority structures. Three optimization objectives are introduced into a greedy strategy to balance the preservation of minority and majority structures and to suppress the generation of new minority structures. A series of experiments and case studies is conducted to evaluate the effectiveness of the proposed MCGS.
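The abstract names two structural primitives, triangles and cut points, as the basis for identifying minority structures. The sketch below is not the MCGS algorithm; it only illustrates, under simple assumptions, how those two primitives can be found in a small undirected graph using the Python standard library. The function names (`build_adjacency`, `triangles`, `cut_points`) and the example edge list are hypothetical and do not come from the paper. The recursive depth-first search is suitable for small graphs only.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles(adj):
    """Return every triangle once, as a sorted tuple of its three nodes."""
    found = set()
    for u in adj:
        # Two neighbors of u that are also adjacent to each other close a triangle.
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                found.add(tuple(sorted((u, v, w))))
    return sorted(found)


def cut_points(adj):
    """Return the cut points (articulation points) via a DFS low-link computation."""
    disc, low, result = {}, {}, set()
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # A non-root node is a cut point if some child subtree
                # cannot reach an ancestor of u without passing through u.
                if parent is not None and low[v] >= disc[u]:
                    result.add(u)
            elif v != parent:
                low[u] = min(low[u], disc[v])
        # The DFS root is a cut point only if it has more than one DFS child.
        if parent is None and children > 1:
            result.add(u)

    for node in adj:
        if node not in disc:
            dfs(node, None)
    return sorted(result)


# Illustrative graph (an assumption): a triangle 0-1-2 with a tail 2-3-4.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
example_adj = build_adjacency(example_edges)
print(triangles(example_adj))
print(cut_points(example_adj))
```

In this example the triangle and the nodes joining it to the tail are the kind of small, rare structures a sampler could try to keep. How MCGS scores and selects such structures is defined by the paper's importance criteria and optimization objectives, which this sketch does not attempt to reproduce.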
In the era of big data, graph sampling is indispensable in many settings. Existing sampling methods are mostly designed for static graphs, and aim to preserve basic structural properties of the original graph (such as degree distribution, clustering
Graphs are naturally used to describe the structures of various real-world systems in biology, society, computer science, etc., where subgraphs or motifs, as basic building blocks, play an important role in function expression and information processing. However
Synthetic Minority Over-sampling Technique (SMOTE) is the most popular over-sampling method. However, its random nature makes the synthesized data and even imbalanced classification results unstable. It means that in case of running SMOTE n different
Arising user-centric graph applications such as route planning and personalized social network analysis have initiated a shift of paradigms in modern graph processing systems towards multi-query analysis, i.e., processing multiple graph queries in pa
We introduce GraSPy, a Python library devoted to statistical inference, machine learning, and visualization of random graphs and graph populations. This package provides flexible and easy-to-use algorithms for analyzing and understanding graphs with