Do you want to publish a course? Click here

For many tasks, state-of-the-art results have been achieved with Transformer-based architectures, resulting in a paradigmatic shift in practices from the use of task-specific architectures to the fine-tuning of pre-trained language models. The ongoin g trend consists in training models with an ever-increasing amount of data and parameters, which requires considerable resources. It leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for English. This raises questions about their usability when applied to small-scale learning problems, for which a limited amount of training data is available, especially for under-resourced languages tasks. The lack of appropriately sized corpora is a hindrance to applying data-driven and transfer learning-based approaches with strong instability cases. In this paper, we establish a state-of-the-art of the efforts dedicated to the usability of Transformer-based models and propose to evaluate these improvements on the question-answering performances of French language which have few resources. We address the instability relating to data scarcity by investigating various training strategies with data augmentation, hyperparameters optimization and cross-lingual transfer. We also introduce a new compact model for French FrALBERT which proves to be competitive in low-resource settings.
Recently Graph Neural Network (GNN) has been used as a promising tool in multi-hop question answering task. However, the unnecessary updations and simple edge constructions prevent an accurate answer span extraction in a more direct and interpretable way. In this paper, we propose a novel model of Breadth First Reasoning Graph (BFR-Graph), which presents a new message passing way that better conforms to the reasoning process. In BFR-Graph, the reasoning message is required to start from the question node and pass to the next sentences node hop by hop until all the edges have been passed, which can effectively prevent each node from over-smoothing or being updated multiple times unnecessarily. To introduce more semantics, we also define the reasoning graph as a weighted graph with considering the number of co-occurrence entities and the distance between sentences. Then we present a more direct and interpretable way to aggregate scores from different levels of granularity based on the GNN. On HotpotQA leaderboard, the proposed BFR-Graph achieves state-of-the-art on answer span prediction.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا