No Arabic abstract
Emergences of computers and information technological revolution made tremendous changes in the real world and provides a different dimension for the intelligent data analysis. Well formed fact, the information at right time and at right place deploy a better knowledge.However, the challenge arises when larger volume of inconsistent data is given for decision making and knowledge extraction. To handle such imprecise data certain mathematical tools of greater importance has developed by researches in recent past namely fuzzy set, intuitionistic fuzzy set, rough Set, formal concept analysis and ordering rules. It is also observed that many information system contains numerical attribute values and therefore they are almost similar instead of exact similar. To handle such type of information system, in this paper we use two processes such as pre process and post process. In pre process we use rough set on intuitionistic fuzzy approximation space with ordering rules for finding the knowledge whereas in post process we use formal concept analysis to explore better knowledge and vital factors affecting decisions.
Recently, several large-scale RDF knowledge bases have been built and applied in many knowledge-based applications. To further increase the number of facts in RDF knowledge bases, logic rules can be used to predict new facts based on the existing ones. Therefore, how to automatically learn reliable rules from large-scale knowledge bases becomes increasingly important. In this paper, we propose a novel rule learning approach named RDF2Rules for RDF knowledge bases. RDF2Rules first mines frequent predicate cycles (FPCs), a kind of interesting frequent patterns in knowledge bases, and then generates rules from the mined FPCs. Because each FPC can produce multiple rules, and effective pruning strategy is used in the process of mining FPCs, RDF2Rules works very efficiently. Another advantage of RDF2Rules is that it uses the entity type information when generates and evaluates rules, which makes the learned rules more accurate. Experiments show that our approach outperforms the compared approach in terms of both efficiency and accuracy.
The problem of knowledge graph (KG) reasoning has been widely explored by traditional rule-based systems and more recently by knowledge graph embedding methods. While logical rules can capture deterministic behavior in a KG they are brittle and mining ones that infer facts beyond the known KG is challenging. Probabilistic embedding methods are effective in capturing global soft statistical tendencies and reasoning with them is computationally efficient. While embedding representations learned from rich training data are expressive, incompleteness and sparsity in real-world KGs can impact their effectiveness. We aim to leverage the complementary properties of both methods to develop a hybrid model that learns both high-quality rules and embeddings simultaneously. Our method uses a cross feedback paradigm wherein, an embedding model is used to guide the search of a rule mining system to mine rules and infer new facts. These new facts are sampled and further used to refine the embedding model. Experiments on multiple benchmark datasets show the effectiveness of our method over other competitive standalone and hybrid baselines. We also show its efficacy in a sparse KG setting and finally explore the connection with negative sampling.
Service manual documents are crucial to the engineering company as they provide guidelines and knowledge to service engineers. However, it has become inconvenient and inefficient for service engineers to retrieve specific knowledge from documents due to the complexity of resources. In this research, we propose an automated knowledge mining and document classification system with novel multi-model transfer learning approaches. Particularly, the classification performance of the system has been improved with three effective techniques: fine-tuning, pruning, and multi-model method. The fine-tuning technique optimizes a pre-trained BERT model by adding a feed-forward neural network layer and the pruning technique is used to retrain the BERT model with new data. The multi-model method initializes and trains multiple BERT models to overcome the randomness of data ordering during the fine-tuning process. In the first iteration of the training process, multiple BERT models are being trained simultaneously. The best model is then selected for the next phase of the training process with another two iterations and the training processes for other BERT models will be terminated. The performance of the proposed system has been evaluated by comparing with two robust baseline methods, BERT and BERT-CNN. Experimental results on a widely used Corpus of Linguistic Acceptability (CoLA) dataset have shown that the proposed techniques perform better than these baseline methods in terms of accuracy and MCC score.
This paper proposes a novel solution to spam detection inspired by a model of the adaptive immune system known as the crossregulation model. We report on the testing of a preliminary algorithm on six e-mail corpora. We also compare our results statically and dynamically with those obtained by the Naive Bayes classifier and another binary classification method we developed previously for biomedical text-mining applications. We show that the cross-regulation model is competitive against those and thus promising as a bio-inspired algorithm for spam detection in particular, and binary classification in general.
Multiple web-scale Knowledge Bases, e.g., Freebase, YAGO, NELL, have been constructed using semi-supervised or unsupervised information extraction techniques and many of them, despite their large sizes, are continuously growing. Much research effort has been put into mining inference rules from knowledge bases. To address the task of rule mining over evolving web-scale knowledge bases, we propose a parallel incremental rule mining framework. Our approach is able to efficiently mine rules based on the relational model and apply updates to large knowledge bases; we propose an alternative metric that reduces computation complexity without compromising quality; we apply multiple optimization techniques that reduce runtime by more than 2 orders of magnitude. Experiments show that our approach efficiently scales to web-scale knowledge bases and saves over 90% time compared to the state-of-the-art batch rule mining system. We also apply our optimization techniques to the batch rule mining algorithm, reducing runtime by more than half compared to the state-of-the-art. To the best of our knowledge, our incremental rule mining system is the first that handles updates to web-scale knowledge bases.