Opinion Fraud Detection via Neural Autoencoder Decision Forest

68 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Manqing Dong

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Manqing Dong - Lina Yao - Xianzhi Wang

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Online reviews play an important role in influencing buyers daily purchase decisions. However, fake and meaningless reviews, which cannot reflect users genuine purchase experience and opinions, widely exist on the Web and pose great challenges for users to make right choices. Therefore,it is desirable to build a fair model that evaluates the quality of products by distinguishing spamming reviews. We present an end-to-end trainable unified model to leverage the appealing properties from Autoencoder and random forest. A stochastic decision tree model is implemented to guide the global parameter learning process. Extensive experiments were conducted on a large Amazon review dataset. The proposed model consistently outperforms a series of compared methods.

قيم البحث

91 - Manqing Dong , Lina Yao , Xianzhi Wang 2018

Random forest and deep neural network are two schools of effective classification methods in machine learning. While the random forest is robust irrespective of the data domain, the deep neural network has advantages in handling high dimensional data . In view that a differentiable neural decision forest can be added to the neural network to fully exploit the benefits of both models, in our work, we further combine convolutional autoencoder with neural decision forest, where autoencoder has its advantages in finding the hidden representations of the input data. We develop a gradient boost module and embed it into the proposed convolutional autoencoder with neural decision forest to improve the performance. The idea of gradient boost is to learn and use the residual in the prediction. In addition, we design a structure to learn the parameters of the neural decision forest and gradient boost module at contiguous steps. The extensive experiments on several public datasets demonstrate that our proposed model achieves good efficiency and prediction performance compared with a series of baseline methods.

التعلم الآلي التعلم الالي

Are You for Real? Detecting Identity Fraud via Dialogue Interactions

72 - Weikang Wang , Jiajun Zhang , Qian Li 2019

Identity fraud detection is of great importance in many real-world scenarios such as the financial industry. However, few studies addressed this problem before. In this paper, we focus on identity fraud detection in loan applications and propose to s olve this problem with a novel interactive dialogue system which consists of two modules. One is the knowledge graph (KG) constructor organizing the personal information for each loan applicant. The other is structured dialogue management that can dynamically generate a series of questions based on the personal KG to ask the applicants and determine their identity states. We also present a heuristic user simulator based on problem analysis to evaluate our method. Experiments have shown that the trainable dialogue system can effectively detect fraudsters, and achieve higher recognition accuracy compared with rule-based systems. Furthermore, our learned dialogue strategies are interpretable and flexible, which can help promote real-world applications.

الحساب واللغة الذكاء الاصطناعي

Dirichlet Variational Autoencoder for Text Modeling

199 - Yijun Xiao , Tiancheng Zhao , William Yang Wang 2018

We introduce an improved variational autoencoder (VAE) for text modeling with topic information explicitly modeled as a Dirichlet latent variable. By providing the proposed model topic awareness, it is more superior at reconstructing input texts. Fur thermore, due to the inherent interactions between the newly introduced Dirichlet variable and the conventional multivariate Gaussian variable, the model is less prone to KL divergence vanishing. We derive the variational lower bound for the new model and conduct experiments on four different data sets. The results show that the proposed model is superior at text reconstruction across the latent space and classifications on learned representations have higher test accuracies.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

Explainable Machine Learning for Fraud Detection

78 - Ismini Psychoula , Andreas Gutmann , Pradip Mainali 2021

The application of machine learning to support the processing of large datasets holds promise in many industries, including financial services. However, practical issues for the full adoption of machine learning remain with the focus being on underst anding and being able to explain the decisions and predictions made by complex models. In this paper, we explore explainability methods in the domain of real-time fraud detection by investigating the selection of appropriate background datasets and runtime trade-offs on both supervised and unsupervised models.

التعلم الآلي الذكاء الاصطناعي

Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection

87 - Zhiwei Liu , Yingtong Dou , Philip S. Yu 2020

The graph-based model can help to detect suspicious fraud online. Owing to the development of Graph Neural Networks~(GNNs), prior research work has proposed many GNN-based fraud detection frameworks based on either homogeneous graphs or heterogeneous graphs. These work follow the existing GNN framework by aggregating the neighboring information to learn the node embedding, which lays on the assumption that the neighbors share similar context, features, and relations. However, the inconsistency problem is hardly investigated, i.e., the context inconsistency, feature inconsistency, and relation inconsistency. In this paper, we introduce these inconsistencies and design a new GNN framework, $mathsf{GraphConsis}$, to tackle the inconsistency problem: (1) for the context inconsistency, we propose to combine the context embeddings with node features, (2) for the feature inconsistency, we design a consistency score to filter the inconsistent neighbors and generate corresponding sampling probability, and (3) for the relation inconsistency, we learn a relation attention weights associated with the sampled nodes. Empirical analysis on four datasets indicates the inconsistency problem is crucial in a fraud detection task. The extensive experiments prove the effectiveness of $mathsf{GraphConsis}$. We also released a GNN-based fraud detection toolbox with implementations of SOTA models. The code is available at https://github.com/safe-graph/DGFraud.

الشبكات الاجتماعية والمعلومات استرجاع المعلومات التعلم الآلي