Distant supervision has been widely used for relation extraction but suffers from the noisy labeling problem. Neural network models have been proposed to denoise with an attention mechanism, but they cannot eliminate noisy data because the attention weights are never zero. Hard-decision methods remove wrongly-labeled instances from the positive set, but they lose the useful information contained in the removed instances. In this paper, we propose a novel generative neural framework named RDSGAN (Rank-based Distant Supervision GAN) which automatically generates valid instances for distant supervision relation extraction. Our framework combines soft attention and hard decision to learn the distribution of true positive instances via adversarial training and selects valid instances conforming to that distribution via rank-based distant supervision, which addresses the false positive problem. Experimental results show the superiority of our framework over strong baselines.
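As an illustration of the rank-based selection idea combining soft attention and hard decision, the following minimal PyTorch sketch scores a bag of encoded sentences with a discriminator, keeps only the top-ranked instances (hard decision), and soft-attends over the survivors. The class names, dimensions, and keep ratio are illustrative assumptions, not the authors' implementation.

# Minimal sketch of rank-based instance selection (illustrative assumptions,
# not the RDSGAN implementation).
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores how likely an encoded sentence is a true positive."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x):
        return torch.sigmoid(self.net(x)).squeeze(-1)

def select_valid_instances(bag_encodings, discriminator, keep_ratio=0.5):
    """Rank instances in a bag by discriminator score, keep the top fraction
    (hard decision), and soft-attend over the kept instances."""
    scores = discriminator(bag_encodings)                 # (bag_size,)
    k = max(1, int(keep_ratio * bag_encodings.size(0)))
    top_scores, top_idx = scores.topk(k)                  # hard decision: drop the rest
    weights = torch.softmax(top_scores, dim=0)            # soft attention over survivors
    bag_repr = (weights.unsqueeze(-1) * bag_encodings[top_idx]).sum(0)
    return bag_repr, top_idx

# usage: a bag of 8 sentence encodings of dimension 256
enc = torch.randn(8, 256)
bag_repr, kept = select_valid_instances(enc, Discriminator())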
The growing demand for structured knowledge has led to great interest in relation extraction, especially in cases with limited supervision. However, existing distant supervision approaches only extract relations expressed in single sentences. In general, cross-sentence relation extraction is under-explored, even in the supervised-learning setting. In this paper, we propose the first approach for applying distant supervision to cross-sentence relation extraction. At the core of our approach is a graph representation that can incorporate both standard dependencies and discourse relations, thus providing a unifying way to model relations within and across sentences. We extract features from multiple paths in this graph, increasing accuracy and robustness when confronted with linguistic variation and analysis error. Experiments on an important extraction task for precision medicine show that our approach can learn an accurate cross-sentence extractor, using only a small existing knowledge base and unlabeled text from biomedical research articles. Compared to the existing distant supervision paradigm, our approach extracted twice as many relations at similar precision, thus demonstrating the prevalence of cross-sentence relations and the promise of our approach.
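The graph construction and multi-path feature extraction can be pictured with the following sketch; the edge formats and the use of networkx are illustrative assumptions, not the paper's code. Intra-sentence dependency edges and cross-sentence discourse edges are merged into one document graph, and several shortest paths between the two entity mentions are read off as features.

# Illustrative sketch (not the paper's code) of a document graph mixing
# dependency and discourse edges, with multi-path feature extraction.
import networkx as nx

def build_document_graph(sentences, dependency_edges, discourse_edges):
    """sentences: list of token lists; dependency_edges / discourse_edges:
    lists of ((sent_i, tok_i), (sent_j, tok_j), label) tuples (assumed format)."""
    g = nx.Graph()
    for s_idx, tokens in enumerate(sentences):
        for t_idx, tok in enumerate(tokens):
            g.add_node((s_idx, t_idx), word=tok)
    for head, dep, label in dependency_edges:
        g.add_edge(head, dep, kind="dep", label=label)      # intra-sentence syntax
    for left, right, label in discourse_edges:
        g.add_edge(left, right, kind="disc", label=label)   # cross-sentence links
    return g

def path_features(graph, entity1, entity2, k=3):
    """Collect up to k shortest paths between the two entity mentions; each
    path's word sequence can then serve as a feature for the extractor."""
    feats = []
    for i, path in enumerate(nx.shortest_simple_paths(graph, entity1, entity2)):
        if i >= k:
            break
        feats.append([graph.nodes[n]["word"] for n in path])
    return feats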
In relation extraction with distant supervision, noisy labels make it difficult to train quality models. Previous neural models addressed this problem with an attention mechanism that attends to sentences that are likely to express the relations. We improve such models by combining the distant supervision data with additional directly supervised data, which we use as supervision for the attention weights. We find that joint training on both types of supervision leads to a better model because it improves the model's ability to identify noisy sentences. In addition, we find that sigmoidal attention weights with max pooling achieve better performance than the commonly used weighted-average attention in this setup. Our proposed method achieves a new state-of-the-art result on the widely used FB-NYT dataset.
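The aggregation difference referred to above can be sketched as follows; the layer sizes, module name, and mode flag are assumptions for illustration, not the paper's released code.

# Sketch contrasting softmax weighted-average attention with
# sigmoid attention + max pooling (illustrative assumptions).
import torch
import torch.nn as nn

class BagAggregator(nn.Module):
    def __init__(self, dim=256, mode="sigmoid_max"):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # unnormalized attention score per sentence
        self.mode = mode

    def forward(self, sent_enc):         # sent_enc: (bag_size, dim)
        logits = self.score(sent_enc).squeeze(-1)           # (bag_size,)
        if self.mode == "softmax_avg":                      # weighted-average baseline
            w = torch.softmax(logits, dim=0)
            return (w.unsqueeze(-1) * sent_enc).sum(0)
        # Sigmoidal attention: each sentence gets an independent [0, 1] weight,
        # so noisy sentences can be pushed toward 0 without others compensating;
        # max pooling then keeps the strongest evidence per dimension.
        w = torch.sigmoid(logits)
        return (w.unsqueeze(-1) * sent_enc).max(dim=0).values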
Although distant supervision automatically generates training data for relation extraction, it also introduces false-positive (FP) and false-negative (FN) training instances into the generated datasets. Whereas both types of errors degrade the final model performance, previous work on distant supervision denoising focuses more on suppressing FP noise and less on resolving the FN problem. We here propose H-FND, a hierarchical false-negative denoising framework for robust distant supervision relation extraction, as an FN denoising solution. H-FND uses a hierarchical policy which first determines whether non-relation (NA) instances should be kept, discarded, or revised during the training process. For those instances which are to be revised, the policy further reassigns them appropriate relations, making them better training inputs. Experiments on SemEval-2010 and TACRED were conducted with controlled FN ratios that randomly turn the relations of training and validation instances into negatives to generate FN instances. In this setting, H-FND revises FN instances correctly and maintains high F1 scores even when 50% of the instances have been turned into negatives. A further experiment on NYT10 shows that H-FND is also applicable in a realistic setting.
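A minimal sketch of the hierarchical keep/discard/revise decision, under the assumption of a simple two-headed policy network; this is not the authors' exact architecture or training procedure.

# Sketch of a hierarchical keep / discard / revise policy for NA-labeled
# instances (illustrative assumptions, not H-FND's implementation).
import torch
import torch.nn as nn

class FNDenoisePolicy(nn.Module):
    def __init__(self, dim=256, n_relations=10):
        super().__init__()
        self.action_head = nn.Linear(dim, 3)               # keep / discard / revise
        self.relation_head = nn.Linear(dim, n_relations)   # used only when revising

    def forward(self, enc):                 # enc: (dim,) encoding of an NA instance
        action = torch.argmax(self.action_head(enc))
        if action == 0:
            return "keep", None             # trust the NA label
        if action == 1:
            return "discard", None          # drop from this epoch's training set
        new_rel = torch.argmax(self.relation_head(enc)).item()
        return "revise", new_rel            # reassign a positive relation label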
Distant Supervised Relation Extraction (DSRE) is usually formulated as the problem of classifying a bag of sentences that contain two query entities into predefined relation classes. Most existing methods treat those relation classes as distinct semantic categories while ignoring their potential connection to the query entities. In this paper, we propose to leverage this connection to improve relation extraction accuracy. Our key ideas are twofold: (1) For sentences belonging to the same relation class, the expression style, i.e., word choice, can vary according to the query entities. To account for this style shift, the model should adjust its parameters in accordance with the entity types. (2) Some relation classes are semantically similar, and the entity types that appear in one relation may also appear in others. Therefore, the entity-type information can be trained across different relation classes and further enhance classes with few samples, i.e., long-tail classes. To unify these two arguments, we develop a novel Dynamic Neural Network for Relation Extraction (DNNRE). The network adopts a novel dynamic parameter generator that dynamically generates the network parameters according to the query entity types and relation classes. By using this mechanism, the network can simultaneously handle the style-shift problem and improve prediction accuracy for long-tail classes. Through our experimental study, we demonstrate the effectiveness of the proposed method and show that it achieves superior performance over state-of-the-art methods.
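The idea of a dynamic parameter generator can be illustrated with the following sketch, which produces relation-scoring weights from the query entities' type embeddings; all names and dimensions are assumptions, not the DNNRE architecture itself.

# Sketch of a dynamic parameter generator conditioned on query entity types
# (illustrative assumptions, not the DNNRE architecture).
import torch
import torch.nn as nn

class DynamicClassifier(nn.Module):
    def __init__(self, sent_dim=256, type_dim=64, n_types=20, n_relations=53):
        super().__init__()
        self.type_emb = nn.Embedding(n_types, type_dim)
        # Maps (head type, tail type) embeddings to a per-example weight matrix.
        self.generator = nn.Linear(2 * type_dim, sent_dim * n_relations)
        self.n_relations = n_relations
        self.sent_dim = sent_dim

    def forward(self, sent_enc, head_type, tail_type):
        # sent_enc: (batch, sent_dim); head_type / tail_type: (batch,) type ids
        t = torch.cat([self.type_emb(head_type), self.type_emb(tail_type)], dim=-1)
        W = self.generator(t).view(-1, self.n_relations, self.sent_dim)
        return torch.einsum("brd,bd->br", W, sent_enc)   # (batch, n_relations) logits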
The recent advancement of pre-trained Transformer models has propelled the development of effective text mining models across various biomedical tasks. However, these models are primarily trained on textual data and often lack the domain knowledge of the entities needed to capture context beyond the sentence. In this study, we introduce a novel framework that enables the model to learn multi-omics biological information about entities (proteins) with the help of additional multi-modal cues such as molecular structure. Towards this, rather than developing modality-specific architectures, we devise a generalized and optimized graph-based multi-modal learning mechanism that utilizes the GraphBERT model to encode the textual and molecular structure information and exploit the underlying features of the various modalities to enable end-to-end learning. We evaluate our proposed method on the Protein-Protein Interaction task from the biomedical corpus, where our generalized approach is observed to benefit from the additional domain-specific modality.
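Since the GraphBERT pipeline itself is not reproduced here, the following generic sketch only illustrates the fusion idea: a Transformer text encoding of the sentence and graph-derived structure encodings of the two proteins are assumed to be given and are concatenated for PPI classification. Every module name and dimension is an illustrative assumption.

# Generic multi-modal fusion sketch for PPI classification
# (illustrative assumptions, not the paper's GraphBERT pipeline).
import torch
import torch.nn as nn

class MultiModalPPIClassifier(nn.Module):
    def __init__(self, text_dim=768, struct_dim=128, n_classes=2):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(text_dim + 2 * struct_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, text_enc, protein1_struct, protein2_struct):
        # text_enc: (batch, text_dim) from a text encoder;
        # protein*_struct: (batch, struct_dim) from a graph encoder of the structure
        fused = torch.cat([text_enc, protein1_struct, protein2_struct], dim=-1)
        return self.fuse(fused)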