No Arabic abstract
Recently, Graph Convolutional Networks (GCNs) have proven to be a powerful mean for Computer Aided Diagnosis (CADx). This approach requires building a population graph to aggregate structural information, where the graph adjacency matrix represents the relationship between nodes. Until now, this adjacency matrix is usually defined manually based on phenotypic information. In this paper, we propose an encoder that automatically selects the appropriate phenotypic measures according to their spatial distribution, and uses the text similarity awareness mechanism to calculate the edge weights between nodes. The encoder can automatically construct the population graph using phenotypic measures which have a positive impact on the final results, and further realizes the fusion of multimodal information. In addition, a novel graph convolution network architecture using multi-layer aggregation mechanism is proposed. The structure can obtain deep structure information while suppressing over-smooth, and increase the similarity between the same type of nodes. Experimental results on two databases show that our method can significantly improve the diagnostic accuracy for Autism spectrum disorder and breast cancer, indicating its universality in leveraging multimodal data for disease prediction.
Graph Convolutional Networks (GCNs) have gained great popularity in tackling various analytics tasks on graph and network data. However, some recent studies raise concerns about whether GCNs can optimally integrate node features and topological structures in a complex graph with rich information. In this paper, we first present an experimental investigation. Surprisingly, our experimental results clearly show that the capability of the state-of-the-art GCNs in fusing node features and topological structures is distant from optimal or even satisfactory. The weakness may severely hinder the capability of GCNs in some classification tasks, since GCNs may not be able to adaptively learn some deep correlation information between topological structures and node features. Can we remedy the weakness and design a new type of GCNs that can retain the advantages of the state-of-the-art GCNs and, at the same time, enhance the capability of fusing topological structures and node features substantially? We tackle the challenge and propose an adaptive multi-channel graph convolutional networks for semi-supervised classification (AM-GCN). The central idea is that we extract the specific and common embeddings from node features, topological structures, and their combinations simultaneously, and use the attention mechanism to learn adaptive importance weights of the embeddings. Our extensive experiments on benchmark data sets clearly show that AM-GCN extracts the most correlated information from both node features and topological structures substantially, and improves the classification accuracy with a clear margin.
Stock price movement prediction is commonly accepted as a very challenging task due to the volatile nature of financial markets. Previous works typically predict the stock price mainly based on its own information, neglecting the cross effect among involved stocks. However, it is well known that an individual stock price is correlated with prices of other stocks in complex ways. To take the cross effect into consideration, we propose a deep learning framework, called Multi-GCGRU, which comprises graph convolutional network (GCN) and gated recurrent unit (GRU) to predict stock movement. Specifically, we first encode multiple relationships among stocks into graphs based on financial domain knowledge and utilize GCN to extract the cross effect based on these pre-defined graphs. To further get rid of prior knowledge, we explore an adaptive relationship learned by data automatically. The cross-correlation features produced by GCN are concatenated with historical records and then fed into GRU to model the temporal dependency of stock prices. Experiments on two stock indexes in China market show that our model outperforms other baselines. Note that our model is rather feasible to incorporate more effective stock relationships containing expert knowledge, as well as learn data-driven relationship.
Existing few-shot learning (FSL) methods assume that there exist sufficient training samples from source classes for knowledge transfer to target classes with few training samples. However, this assumption is often invalid, especially when it comes to fine-grained recognition. In this work, we define a new FSL setting termed few-shot fewshot learning (FSFSL), under which both the source and target classes have limited training samples. To overcome the source class data scarcity problem, a natural option is to crawl images from the web with class names as search keywords. However, the crawled images are inevitably corrupted by large amount of noise (irrelevant images) and thus may harm the performance. To address this problem, we propose a graph convolutional network (GCN)-based label denoising (LDN) method to remove the irrelevant images. Further, with the cleaned web images as well as the original clean training images, we propose a GCN-based FSL method. For both the LDN and FSL tasks, a novel adaptive aggregation GCN (AdarGCN) model is proposed, which differs from existing GCN models in that adaptive aggregation is performed based on a multi-head multi-level aggregation module. With AdarGCN, how much and how far information carried by each graph node is propagated in the graph structure can be determined automatically, therefore alleviating the effects of both noisy and outlying training samples. Extensive experiments show the superior performance of our AdarGCN under both the new FSFSL and the conventional FSL settings.
Face identification/recognition has significantly advanced over the past years. However, most of the proposed approaches rely on static RGB frames and on neutral facial expressions. This has two disadvantages. First, important facial shape cues are ignored. Second, facial deformations due to expressions can have an impact on the performance of such a method. In this paper, we propose a novel framework for dynamic 3D face identification/recognition based on facial keypoints. Each dynamic sequence of facial expressions is represented as a spatio-temporal graph, which is constructed using 3D facial landmarks. Each graph node contains local shape and texture features that are extracted from its neighborhood. For the classification/identification of faces, a Spatio-temporal Graph Convolutional Network (ST-GCN) is used. Finally, we evaluate our approach on a challenging dynamic 3D facial expression dataset.
Deep learning systems have been successfully applied to Euclidean data such as images, video, and audio. In many applications, however, information and their relationships are better expressed with graphs. Graph Convolutional Networks (GCNs) appear to be a promising approach to efficiently learn from graph data structures, having shown advantages in many critical applications. As with other deep learning modalities, hardware acceleration is critical. The challenge is that real-world graphs are often extremely large and unbalanced; this poses significant performance demands and design challenges. In this paper, we propose Autotuning-Workload-Balancing GCN (AWB-GCN) to accelerate GCN inference. To address the issue of workload imbalance in processing real-world graphs, three hardware-based autotuning techniques are proposed: dynamic distribution smoothing, remote switching, and row remapping. In particular, AWB-GCN continuously monitors the sparse graph pattern, dynamically adjusts the workload distribution among a large number of processing elements (up to 4K PEs), and, after converging, reuses the ideal configuration. Evaluation is performed using an Intel D5005 FPGA with five commonly-used datasets. Results show that 4K-PE AWB-GCN can significantly elevate PE utilization by 7.7x on average and demonstrate considerable performance speedups over CPUs (3255x), GPUs (80.3x), and a prior GCN accelerator (5.1x).