ترغب بنشر مسار تعليمي؟ اضغط هنا

Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics

123   0   0.0 ( 0 )
 نشر من قبل Mark Weber
 تاريخ النشر 2019
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Anti-money laundering (AML) regulations play a critical role in safeguarding financial systems, but bear high costs for institutions and drive financial exclusion for those on the socioeconomic and international margins. The advent of cryptocurrency has introduced an intriguing paradox: pseudonymity allows criminals to hide in plain sight, but open data gives more power to investigators and enables the crowdsourcing of forensic analysis. Meanwhile advances in learning algorithms show great promise for the AML toolkit. In this workshop tutorial, we motivate the opportunity to reconcile the cause of safety with that of financial inclusion. We contribute the Elliptic Data Set, a time series graph of over 200K Bitcoin transactions (nodes), 234K directed payment flows (edges), and 166 node features, including ones based on non-public data; to our knowledge, this is the largest labelled transaction data set publicly available in any cryptocurrency. We share results from a binary classification task predicting illicit transactions using variations of Logistic Regression (LR), Random Forest (RF), Multilayer Perceptrons (MLP), and Graph Convolutional Networks (GCN), with GCN being of special interest as an emergent new method for capturing relational information. The results show the superiority of Random Forest (RF), but also invite algorithmic work to combine the respective powers of RF and graph methods. Lastly, we consider visualization for analysis and explainability, which is difficult given the size and dynamism of real-world transaction graphs, and we offer a simple prototype capable of navigating the graph and observing model performance on illicit activity over time. With this tutorial and data set, we hope to a) invite feedback in support of our ongoing inquiry, and b) inspire others to work on this societally important challenge.



قيم البحث

اقرأ أيضاً

Organized crime inflicts human suffering on a genocidal scale: the Mexican drug cartels have murdered 150,000 people since 2006, upwards of 700,000 people per year are exported in a human trafficking industry enslaving an estimated 40 million people. These nefarious industries rely on sophisticated money laundering schemes to operate. Despite tremendous resources dedicated to anti-money laundering (AML) only a tiny fraction of illicit activity is prevented. The research community can help. In this brief paper, we map the structural and behavioral dynamics driving the technical challenge. We review AML methods, current and emergent. We provide a first look at scalable graph convolutional neural networks for forensic analysis of financial data, which is massive, dense, and dynamic. We report preliminary experimental results using a large synthetic graph (1M nodes, 9M edges) generated by a data simulator we created called AMLSim. We consider opportunities for high performance efficiency, in terms of computation and memory, and we share results from a simple graph compression experiment. Our results support our working hypothesis that graph deep learning for AML bears great promise in the fight against criminal financial activity.
97 - En-Yu Yu , Yan Fu , Jun-Lin Zhou 2021
Many real-world systems can be expressed in temporal networks with nodes playing far different roles in structure and function and edges representing the relationships between nodes. Identifying critical nodes can help us control the spread of public opinions or epidemics, predict leading figures in academia, conduct advertisements for various commodities, and so on. However, it is rather difficult to identify critical nodes because the network structure changes over time in temporal networks. In this paper, considering the sequence topological information of temporal networks, a novel and effective learning framework based on the combination of special GCNs and RNNs is proposed to identify nodes with the best spreading ability. The effectiveness of the approach is evaluated by weighted Susceptible-Infected-Recovered model. Experimental results on four real-world temporal networks demonstrate that the proposed method outperforms both traditional and deep learning benchmark methods in terms of the Kendall $tau$ coefficient and top $k$ hit rate.
Graph neural networks (GNN) has been demonstrated to be effective in classifying graph structures. To further improve the graph representation learning ability, hierarchical GNN has been explored. It leverages the differentiable pooling to cluster no des into fixed groups, and generates a coarse-grained structure accompanied with the shrinking of the original graph. However, such clustering would discard some graph information and achieve the suboptimal results. It is because the node inherently has different characteristics or roles, and two non-isomorphic graphs may have the same coarse-grained structure that cannot be distinguished after pooling. To compensate the loss caused by coarse-grained clustering and further advance GNN, we propose a multi-channel graph convolutional networks (MuchGCN). It is motivated by the convolutional neural networks, at which a series of channels are encoded to preserve the comprehensive characteristics of the input image. Thus, we define the specific graph convolutions to learn a series of graph channels at each layer, and pool graphs iteratively to encode the hierarchical structures. Experiments have been carefully carried out to demonstrate the superiority of MuchGCN over the state-of-the-art graph classification algorithms.
149 - Hao Peng , Jianxin Li , Qiran Gong 2019
Events are happening in real-world and real-time, which can be planned and organized occasions involving multiple people and objects. Social media platforms publish a lot of text messages containing public events with comprehensive topics. However, m ining social events is challenging due to the heterogeneous event elements in texts and explicit and implicit social network structures. In this paper, we design an event meta-schema to characterize the semantic relatedness of social events and build an event-based heterogeneous information network (HIN) integrating information from external knowledge base, and propose a novel Pair-wise Popularity Graph Convolutional Network (PP-GCN) based fine-grained social event categorization model. We propose a Knowledgeable meta-paths Instances based social Event Similarity (KIES) between events and build a weighted adjacent matrix as input to the PP-GCN model. Comprehensive experiments on real data collections are conducted to compare various social event detection and clustering tasks. Experimental results demonstrate that our proposed framework outperforms other alternative social event categorization techniques.
Twitter users operated by automated programs, also known as bots, have increased their appearance recently and induced undesirable social effects. While extensive research efforts have been devoted to the task of Twitter bot detection, previous metho ds leverage only a small fraction of user semantic and profile information, which leads to their failure in identifying bots that exploit multi-modal user information to disguise as genuine users. Apart from that, the state-of-the-art bot detectors fail to leverage user follow relationships and the graph structure it forms. As a result, these methods fall short of capturing new generations of Twitter bots that act in groups and seem genuine individually. To address these two challenges of Twitter bot detection, we propose BotRGCN, which is short for Bot detection with Relational Graph Convolutional Networks. BotRGCN addresses the challenge of community by constructing a heterogeneous graph from follow relationships and apply relational graph convolutional networks to the Twittersphere. Apart from that, BotRGCN makes use of multi-modal user semantic and property information to avoid feature engineering and augment its ability to capture bots with diversified disguise. Extensive experiments demonstrate that BotRGCN outperforms competitive baselines on a comprehensive benchmark TwiBot-20 which provides follow relationships. BotRGCN is also proved to effectively leverage three modals of user information, namely semantic, property and neighborhood information, to boost bot detection performance.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا