ترغب بنشر مسار تعليمي؟ اضغط هنا

71 - Jin Xu , Yinuo Guo , Junfeng Hu 2020
Copying mechanism has been commonly used in neural paraphrasing networks and other text generation tasks, in which some important words in the input sequence are preserved in the output sequence. Similarly, in machine translation, we notice that ther e are certain words or phrases appearing in all good translations of one source text, and these words tend to convey important semantic information. Therefore, in this work, we define words carrying important semantic meanings in sentences as semantic core words. Moreover, we propose an MT evaluation approach named Semantically Weighted Sentence Similarity (SWSS). It leverages the power of UCCA to identify semantic core words, and then calculates sentence similarity scores on the overlap of semantic core words. Experimental results show that SWSS can consistently improve the performance of popular MT evaluation metrics which are based on lexical similarity.
105 - Zhaocheng Zhu , Junfeng Hu 2017
Recently, doc2vec has achieved excellent results in different tasks. In this paper, we present a context aware variant of doc2vec. We introduce a novel weight estimating mechanism that generates weights for each word occurrence according to its contr ibution in the context, using deep neural networks. Our context aware model can achieve similar results compared to doc2vec initialized byWikipedia trained vectors, while being much more efficient and free from heavy external corpus. Analysis of context aware weights shows they are a kind of enhanced IDF weights that capture sub-topic level keywords in documents. They might result from deep neural networks that learn hidden representations with the least entropy.
The problem of community detection receives great attention in recent years. Many methods have been proposed to discover communities in networks. In this paper, we propose a Gaussian stochastic blockmodel that uses Gaussian distributions to fit weigh t of edges in networks for non-overlapping community detection. The maximum likelihood estimation of this model has the same objective function as general label propagation with node preference. The node preference of a specific vertex turns out to be a value proportional to the intra-community eigenvector centrality (the corresponding entry in principal eigenvector of the adjacency matrix of the subgraph inside that vertexs community) under maximum likelihood estimation. Additionally, the maximum likelihood estimation of a constrained version of our model is highly related to another extension of label propagation algorithm, namely, the label propagation algorithm under constraint. Experiments show that the proposed Gaussian stochastic blockmodel performs well on various benchmark networks.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا