
Impressive milestones have been achieved in text matching by adopting a cross-attention mechanism to capture pertinent semantic connections between two sentence representations. However, regular cross-attention focuses on word-level links between the two input sequences, neglecting the importance of contextual information. We propose a context-aware interaction network (COIN) to properly align two sequences and infer their semantic relationship. Specifically, each interaction block includes (1) a context-aware cross-attention mechanism to effectively integrate contextual information when aligning two sequences, and (2) a gate fusion layer to flexibly interpolate aligned representations. We apply multiple stacked interaction blocks to produce alignments at different levels and gradually refine the attention results. Experiments on two question matching datasets and detailed analyses demonstrate the effectiveness of our model.
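As a rough illustration of the interaction block described above, the following PyTorch sketch shows a cross-attention layer whose keys are enriched with a pooled context vector, followed by a gate that interpolates the original and aligned representations; the mean-pooled context summary and the projection layout are our assumptions, not the paper's exact formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedCrossAttention(nn.Module):
    # Illustrative interaction block: cross-attention with context-enriched
    # keys, followed by a gate that interpolates between the original and
    # the aligned representation.
    def __init__(self, dim):
        super().__init__()
        self.proj_q = nn.Linear(dim, dim)
        self.proj_k = nn.Linear(2 * dim, dim)   # word features + context summary
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, a, b):
        # a: (batch, len_a, dim), b: (batch, len_b, dim)
        ctx = b.mean(dim=1, keepdim=True).expand_as(b)  # crude context vector
        k = self.proj_k(torch.cat([b, ctx], dim=-1))    # context-aware keys
        scores = torch.matmul(self.proj_q(a), k.transpose(1, 2))
        aligned = torch.matmul(F.softmax(scores, dim=-1), b)
        g = torch.sigmoid(self.gate(torch.cat([a, aligned], dim=-1)))
        return g * a + (1 - g) * aligned                # gated interpolation

Stacking several such blocks, as the abstract describes, would gradually refine the alignments.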
Product quantization (PQ) is a widely used technique for ad-hoc retrieval. Recent studies propose supervised PQ, where the embedding and quantization models can be jointly trained with supervised learning. However, there is a lack of an appropriate formulation of the joint training objective; thus, the improvements over previous non-supervised baselines are limited in practice. In this work, we propose Matching-oriented Product Quantization (MoPQ), in which a novel objective, the Multinoulli Contrastive Loss (MCL), is formulated. By minimizing MCL, we maximize the matching probability of a query and its ground-truth key, which contributes to optimal retrieval accuracy. Given that the exact computation of MCL is intractable due to the demand for a vast number of contrastive samples, we further propose Differentiable Cross-device Sampling (DCS), which significantly augments the contrastive samples for a precise approximation of MCL. We conduct extensive experimental studies on four real-world datasets, whose results verify the effectiveness of MoPQ. The code is available at https://github.com/microsoft/MoPQ.
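The core of the matching-oriented objective can be approximated with in-batch contrastive samples, as in the Python sketch below; this is a simplified stand-in for MCL that omits the cross-device sampling (DCS) step, and the temperature value is an assumption.

import torch
import torch.nn.functional as F

def contrastive_matching_loss(queries, quantized_keys, temperature=0.05):
    # queries, quantized_keys: (batch, dim); row i of quantized_keys is the
    # reconstructed ground-truth key for query i, and every other row serves
    # as a contrastive sample.
    q = F.normalize(queries, dim=-1)
    k = F.normalize(quantized_keys, dim=-1)
    logits = q @ k.t() / temperature                 # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)
    # Cross-entropy over the key axis maximizes the matching probability
    # of each query with its ground-truth key.
    return F.cross_entropy(logits, labels)

DCS would augment the key matrix with quantized keys gathered from other devices while keeping gradients flowing, yielding a closer approximation of MCL.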
SemEval Task 4 aims to find a proper option from multiple candidates to resolve a machine reading comprehension (MRC) task. Most existing approaches concatenate the question and option to form a context-aware model. However, we argue that straightforward concatenation can only provide a coarse-grained context for the MRC task, ignoring the specific position of the option relative to the question. In this paper, we propose a novel MRC model that fills options into the question to produce a fine-grained context (defined as a summary) which better reveals the relationship between option and question. We conduct a series of experiments on the given dataset, and the results show that our approach outperforms its counterparts by a large margin.
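The option-filling idea can be illustrated in a few lines of Python; the placeholder token and the example question are our assumptions, since the abstract does not specify the exact format.

def fill_option(question: str, option: str, blank: str = "_____") -> str:
    # Substitute the candidate option at the blank so the model scores a
    # fine-grained, option-specific context (the "summary") rather than a
    # coarse question-option concatenation.
    return question.replace(blank, option, 1)

question = "The inventors were granted a patent for the _____ ."
for option in ["device", "melody"]:
    print(fill_option(question, option))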
We introduce SPARTA, a novel neural retrieval method that shows great promise in performance, generalization, and interpretability for open-domain question answering. Unlike many neural ranking methods that use dense vector nearest-neighbor search, SPARTA learns a sparse representation that can be efficiently implemented as an inverted index. The resulting representation enables scalable neural retrieval that does not require expensive approximate vector search and leads to better performance than its dense counterpart. We validate our approach on 4 open-domain question answering (OpenQA) tasks and 11 retrieval question answering (ReQA) tasks. SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks on both English and Chinese datasets, including Open SQuAD and CMRC. Analysis also confirms that the proposed method creates human-interpretable representations and allows flexible control over the trade-off between performance and efficiency.
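The efficiency argument follows from the sparsity: once each document is reduced to a small set of weighted terms, retrieval reduces to inverted-index lookups. A toy Python sketch (the vocabulary and weights are fabricated for illustration, not SPARTA's learned scores):

from collections import defaultdict

# Each document is a sparse {term: weight} map; scoring a query only walks
# the posting lists of the query's terms, with no dense vector search.
index = defaultdict(list)                      # term -> [(doc_id, weight)]
docs = {0: {"capital": 1.4, "france": 2.1}, 1: {"capital": 0.9, "letter": 1.7}}
for doc_id, rep in docs.items():
    for term, weight in rep.items():
        index[term].append((doc_id, weight))

def search(query_terms):
    scores = defaultdict(float)
    for term in query_terms:
        for doc_id, weight in index[term]:
            scores[doc_id] += weight
    return sorted(scores.items(), key=lambda s: -s[1])

print(search(["capital", "france"]))           # doc 0 ranks first

The nonzero terms of each representation are also directly readable, which is where the claimed interpretability comes from.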
In this paper, we focus on the problem of keyword-document matching with different relevance levels. In our recommendation system, different people follow different hot keywords with interest. We need to attach documents to each keyword and then distribute the documents to the people who follow these keywords. The ideal documents should share the keyword's topic, which we call topic-aware relevance. In other words, topic-aware relevant documents are better than partially relevant ones in this application. However, previous work never defines topic-aware relevance clearly. To tackle this problem, we define three relevance levels for the keyword-document matching task: topic-aware relevance, partial relevance, and irrelevance. To capture the relevance between a short keyword and a document at these three levels, we should not only combine the latent topic of the document with its deep neural representation, but also model complex interactions between the keyword and the document. To this end, we propose a Two-stage Interaction and Topic-Aware text matching model (TITA). For "topic-aware", we introduce a neural topic model to analyze the topic of the document and then use it to further encode the document. For "two-stage interaction", we propose two successive stages to model the complex interactions between the keyword and the document. Extensive experiments reveal that TITA outperforms other well-designed baselines and shows excellent performance in our recommendation system.
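The "topic-aware" encoding can be sketched as a simple fusion of a topic distribution with a dense document encoding; the dimensions and the concatenation-based fusion below are our assumptions, not TITA's exact design.

import torch
import torch.nn as nn

class TopicAwareEncoder(nn.Module):
    # Illustrative fusion: project the neural topic model's topic
    # distribution and concatenate it with the deep document encoding,
    # so matching sees both surface semantics and latent topic.
    def __init__(self, enc_dim=768, num_topics=50):
        super().__init__()
        self.topic_proj = nn.Linear(num_topics, enc_dim)

    def forward(self, doc_encoding, topic_dist):
        # doc_encoding: (batch, enc_dim), topic_dist: (batch, num_topics)
        return torch.cat([doc_encoding, self.topic_proj(topic_dist)], dim=-1)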
Keyword extraction is the task of identifying words (or multi-word expressions) that best describe a given document; in news portals, keywords serve to link articles on similar topics. In this work, we develop and evaluate our methods on four novel datasets covering less-represented, morphologically rich languages in the European news media industry (Croatian, Estonian, Latvian, and Russian). First, we evaluate two supervised neural transformer-based methods, the Transformer-based Neural Tagger for Keyword Identification (TNT-KID) and Bidirectional Encoder Representations from Transformers (BERT) with an additional Bidirectional Long Short-Term Memory Conditional Random Fields (BiLSTM-CRF) classification head, and compare them to a baseline Term Frequency-Inverse Document Frequency (TF-IDF) based unsupervised approach. Next, we show that by combining the keywords retrieved by both neural transformer-based methods and extending the final set of keywords with an unsupervised TF-IDF based technique, we can drastically improve the recall of the system, making it appropriate for use as a recommendation system in the media house environment.
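The combination step can be sketched directly in Python; the function below merges the two supervised extractors' outputs and pads the set with top-ranked TF-IDF candidates, though the cutoff and ranking details are our assumptions.

def combine_keywords(tnt_kid_kws, bert_crf_kws, tfidf_ranked, extra=5):
    # Union of the two neural extractors' keywords (deduplicated, order
    # preserved), extended with top TF-IDF candidates to boost recall.
    merged = list(dict.fromkeys(tnt_kid_kws + bert_crf_kws))
    extensions = [kw for kw in tfidf_ranked if kw not in merged][:extra]
    return merged + extensions

print(combine_keywords(["election", "riga"], ["riga", "budget"],
                       ["budget", "council", "tax"]))
# ['election', 'riga', 'budget', 'council', 'tax']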
IoT sensors use the publish/subscribe model for communication to benefit from its decoupled nature with respect to space, time, and synchronization. Because of the heterogeneity of communicating parties, semantic decoupling is added as a fourth dimension. The added semantic decoupling complicates the matching process and reduces its efficiency. The proposed algorithm clusters subscriptions and events according to topic and performs the matching process within these clusters, which increases throughput by reducing matching time. Moreover, the accuracy of matching is improved when subscriptions must be fully approximated. This work shows the benefit of clustering, as well as the improvement in matching accuracy and efficiency achieved using this approach.
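A minimal Python sketch of the topic-clustered matching (the predicate-based subscription format is an assumption; real content-based matchers are more elaborate):

from collections import defaultdict

# Subscriptions are grouped by topic, so an incoming event is checked only
# against the subscriptions in its own cluster rather than the full set.
subscriptions = defaultdict(list)              # topic -> [(subscriber, predicate)]

def subscribe(topic, subscriber, predicate):
    subscriptions[topic].append((subscriber, predicate))

def publish(topic, event):
    return [sub for sub, pred in subscriptions[topic] if pred(event)]

subscribe("temperature", "hvac-ctrl", lambda e: e["value"] > 30)
subscribe("humidity", "logger", lambda e: True)
print(publish("temperature", {"value": 35}))   # only the temperature cluster is scanned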
Service-Oriented Computing (SOC) is changing the way software systems are developed. Each web service has a specific purpose to serve, so a single service often cannot satisfy a user's request on its own. In this paper, we propose a web service composition method based on OWL ontology and design an automatic system model for service discovery and composition. The method uses a domain ontology and WordNet to calculate the matching degree between input and output parameters, and uses a category ontology to solve the problem of semantic heterogeneity in web service descriptions. We consider services with a single input and a single output and use cost as the QoS criterion. The method can enhance the efficiency and accuracy of service composition, and experiments are used to validate and analyze the proposed system.
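The WordNet part of the parameter matching can be sketched with NLTK (assuming the WordNet corpus has been downloaded; the threshold at which a match is accepted is not specified in the abstract):

from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

def param_similarity(out_param: str, in_param: str) -> float:
    # Score how well one service's output parameter matches the next
    # service's input parameter, taking the best WordNet path similarity
    # over all synset pairs.
    best = 0.0
    for s1 in wn.synsets(out_param):
        for s2 in wn.synsets(in_param):
            best = max(best, s1.path_similarity(s2) or 0.0)
    return best

# A composition step chaining service A's output into service B's input is
# plausible when the score is high:
print(param_similarity("city", "town"))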
Exams (objective tests) play an important role in evaluating students' intended learning outcomes and estimating the extent to which the sought goals are met. The skill of formulating exam tests is therefore one of the main criteria of evaluation quality. An objective test is a type of test that offers the student a number of answers to a question or solutions to a problem, from which the student should identify (recognize) the right one. They are called "objective tests" because they are corrected objectively; in other words, their correction does not differ from one evaluator to another. Given the importance of formulating this type of question and the scarcity of research on this topic, this study aimed to evaluate how well formulated objective tests satisfy the conditions for formulating objective tests. The research sample was the final written exams (second term) for the courses of the four years taught by faculty staff members at the Nursing Faculty in the academic year 2014-2015. The sample consisted of 719 multiple-choice questions, 248 true/false items, and 21 matching items. The most significant findings were: the faculty members followed the conditions for formulating the stem of multiple-choice questions, though attention to language clarity and accuracy is still needed; they followed the conditions for formulating alternatives in multiple-choice tests and all formulation conditions for true/false items; and they relied heavily on multiple-choice questions and true/false items, while matching (proportionality) items were used very rarely.
Response spectrum analysis and equivalent static analysis are widely used by engineers and engineering offices to estimate the response of buildings and structures to earthquakes. However, performance-based procedures for evaluating buildings and new designs according to the Syrian code and other international codes require response analysis using sets of earthquake records, from which engineering demand parameters (EDPs) (floor displacements, story drifts, member forces, member deformations, etc.) of buildings and special structures subjected to ground motions can be estimated and the required performance criteria verified. These records should be properly selected and scaled in compliance with site-specific hazard conditions to estimate the EDPs and ensure that they reflect the "expected" median demands. In this study, the background, selection procedures compatible with the Syrian code, and a review of the most common scaling methods are introduced. The structural response was studied by comparing displacements obtained from response spectrum analysis, from records scaled by peak ground acceleration (PGA), and from synthetic time-history records in the time and frequency domains (generated according to the Syrian response spectrum). Two three-dimensional models of real buildings in Lattakia city were used as case studies, with results obtained from 20 analysis runs. The results show that analysis using synthetic records compatible with the Syrian code gives noticeably lower displacement estimates compared with response spectrum analysis and analysis using records scaled by PGA.
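For reference, PGA scaling (one of the reviewed methods) amounts to multiplying the whole accelerogram by a single factor; the Python sketch below uses a fabricated record and an assumed target PGA.

import numpy as np

def scale_to_pga(record, target_pga):
    # Scale the accelerogram so its peak ground acceleration equals the
    # target; unlike spectral matching, the frequency content is unchanged.
    return record * (target_pga / np.max(np.abs(record)))

record = np.array([0.02, -0.15, 0.31, -0.27, 0.08])  # toy accelerogram (g)
scaled = scale_to_pga(record, target_pga=0.25)       # assumed design PGA
print(np.max(np.abs(scaled)))                        # 0.25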