We propose to tackle data-to-text generation tasks by directly splicing together retrieved segments of text from "neighbor" source-target pairs. Unlike recent work that conditions on retrieved neighbors but generates text token-by-token, left-to-right, we learn a policy that directly manipulates segments of neighbor text, by inserting or replacing them in partially constructed generations. Standard techniques for training such a policy require an oracle derivation for each generation, and we prove that finding the shortest such derivation can be reduced to parsing under a particular weighted context-free grammar. We find that policies learned in this way perform on par with strong baselines in terms of automatic and human evaluation, but allow for more interpretable and controllable generation.
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore, which allows them to learn through explicitly memorizing the training datapoints. While effective, these models often require retrieval from a large datastore at test time, significantly increasing the inference overhead and thus limiting the deployment of non-parametric NLMs in practical applications. In this paper, we take the recently proposed k-nearest neighbors language model as an example, exploring methods to improve its efficiency along various dimensions. Experiments on the standard WikiText-103 benchmark and domain-adaptation datasets show that our methods are able to achieve up to a 6x speed-up in inference speed while retaining comparable performance. The empirical analysis we present may provide guidelines for future research seeking to develop or deploy more efficient non-parametric NLMs.
Emotion detection is an important task that can be applied to social media data to discover new knowledge. While the use of deep learning methods for this task has been prevalent, they are black-box models, making their decisions hard to interpret for a human operator. Therefore, in this paper, we propose an approach using weighted k Nearest Neighbours (kNN), a simple, easy to implement, and explainable machine learning model. These qualities can help to enhance results' reliability and guide error analysis. In particular, we apply the weighted kNN model to the shared emotion detection task in tweets from SemEval-2018. Tweets are represented using different text embedding methods and emotion lexicon vocabulary scores, and classification is done by an ensemble of weighted kNN models. Our best approaches obtain results competitive with state-of-the-art solutions and open up a promising alternative path to neural network methods.
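As a rough illustration of why weighted kNN is easy to inspect, the following minimal Python sketch classifies a query vector by letting each of its k nearest neighbours vote with weight inversely proportional to its distance. The inverse-distance weighting, the Euclidean metric, the toy 2-D vectors, and the function name `weighted_knn_predict` are assumptions made for this example; the abstract does not specify the paper's weighting scheme, embeddings, or ensemble design.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Classify `query` with distance-weighted kNN (illustrative sketch).

    train: list of (vector, label) pairs; vectors are equal-length sequences.
    Returns (predicted_label, votes), where votes maps each label seen among
    the k nearest neighbours to its accumulated weight.
    """
    # Rank training points by Euclidean distance to the query.
    ranked = sorted(
        ((math.dist(vec, query), label) for vec, label in train),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for distance, label in ranked[:k]:
        # Assumed weighting: closer neighbours get larger votes.
        votes[label] += 1.0 / (distance + eps)
    predicted = max(votes, key=votes.get)
    return predicted, dict(votes)


# Toy stand-ins for tweet embeddings (hypothetical data).
toy_train = [
    ((0.0, 0.0), "joy"),
    ((0.0, 1.0), "joy"),
    ((1.0, 0.0), "anger"),
    ((5.0, 5.0), "anger"),
    ((5.0, 6.0), "anger"),
]

label, votes = weighted_knn_predict(toy_train, (0.1, 0.1), k=3)
print(label, votes)
```

The returned `votes` dictionary shows which neighbours drove the decision and by how much, which is the kind of per-prediction evidence an error analysis can use.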
Linear regression methods impose strong constraints on regression models, especially on the error terms, which are assumed to be independent and normally distributed. These assumptions are not satisfied in many studies, which leads to non-negligible bias relative to the true model and affects the credibility of the study. In this paper we address the problem of estimating the regression function using the Nadaraya-Watson kernel estimator and the k-nearest neighbor estimator as alternatives to parametric linear regression estimators. We conduct a simulation study on an imposed model, comparing these methods using the statistical programming language R, with the mean squared error (MSE) as the criterion for determining the best estimate. The results of the simulation study indicate that the nonparametric estimators represent the regression function more effectively and efficiently than the linear regression estimators, and that the two nonparametric estimators are close to each other in performance.
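The two nonparametric estimators named above can be sketched in a few lines. The example below is written in Python for illustration (the study itself used R) and is not the authors' code: the Gaussian kernel, the bandwidth `h`, the neighbour count `k`, the sine-shaped test function, and the noise level are all assumptions chosen for the demonstration. The Nadaraya-Watson estimate at a point is a kernel-weighted average of the responses, and the kNN estimate is the plain average of the responses of the k nearest design points.

```python
import math
import random


def nadaraya_watson(xs, ys, x0, h):
    """Kernel-weighted average of ys at x0, using a Gaussian kernel (assumed)."""
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the global mean.
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total


def knn_regress(xs, ys, x0, k):
    """Average of the responses of the k design points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / len(nearest)


def mse(predictions, truth):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)


def simulate(n=200, h=0.3, k=15, seed=0):
    """Compare both estimators on a hypothetical model y = sin(x) + noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.3) for x in xs]
    grid = [i * 2.0 * math.pi / 50 for i in range(51)]
    truth = [math.sin(g) for g in grid]
    nw_fit = [nadaraya_watson(xs, ys, g, h) for g in grid]
    knn_fit = [knn_regress(xs, ys, g, k) for g in grid]
    return mse(nw_fit, truth), mse(knn_fit, truth)


if __name__ == "__main__":
    nw_mse, knn_mse = simulate()
    print(f"Nadaraya-Watson MSE: {nw_mse:.4f}")
    print(f"kNN MSE:             {knn_mse:.4f}")
```

In a comparison of this kind the outcome depends heavily on the smoothing parameters, so `h` and `k` would normally be tuned (for example by cross-validation) before the MSE values are compared.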
In this paper, we compare several mathematical interpolation methods applied to high-accuracy, very large laser point clouds that represent only the DTM. To this end, a variety of laser-scanned areas was chosen to represent different types of terrain, including complex and flat terrain. Man-made features are excluded from this study, and different laser cloud densities are used to make the study more general. A set of different algorithms was applied to determine which one is most suitable, and the resulting interpolations were then compared. The results show that point density has a great impact on which interpolation method is optimal. Moreover, the nearest neighbor algorithm proved to be the best method among the alternatives considered.