Do you want to publish a course? Click here

Learning and Evaluating Chinese Idiom Embeddings

التعلم وتقييم embeddings الصينية

438   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

We study the task of learning and evaluating Chinese idiom embeddings. We first construct a new evaluation dataset that contains idiom synonyms and antonyms. Observing that existing Chinese word embedding methods may not be suitable for learning idiom embeddings, we further present a BERT-based method that directly learns embedding vectors for individual idioms. We empirically compare representative existing methods and our method. We find that our method substantially outperforms existing methods on the evaluation dataset we have constructed.



References used
https://aclanthology.org/
rate research

Read More

Gender bias in word embeddings gradually becomes a vivid research field in recent years. Most studies in this field aim at measurement and debiasing methods with English as the target language. This paper investigates gender bias in static word embed dings from a unique perspective, Chinese adjectives. By training word representations with different models, the gender bias behind the vectors of adjectives is assessed. Through a comparison between the produced results and a human scored data set, we demonstrate how gender bias encoded in word embeddings differentiates from people's attitudes.
Recent researches show that pre-trained models (PTMs) are beneficial to Chinese Word Segmentation (CWS). However, PTMs used in previous works usually adopt language modeling as pre-training tasks, lacking task-specific prior segmentation knowledge an d ignoring the discrepancy between pre-training tasks and downstream CWS tasks. In this paper, we propose a CWS-specific pre-trained model MetaSeg, which employs a unified architecture and incorporates meta learning algorithm into a multi-criteria pre-training task. Empirical results show that MetaSeg could utilize common prior segmentation knowledge from different existing criteria and alleviate the discrepancy between pre-trained models and downstream CWS tasks. Besides, MetaSeg can achieve new state-of-the-art performance on twelve widely-used CWS datasets and significantly improve model performance in low-resource settings.
This paper is an analytical study of tourism plans of China’s Hainan Province and the Syrian coastal region before the Syrian war commenced in 2011. It compares the two types of tourism (in the Mediterranean and the Asia Pacific) and concludes with a n integrated model of successful regional tourism. The focus is on strategic plans of the last two decades and how they facilitate tourism, specifically, how strategic plans can be translated into sustainable tourism development projects in both regions. The strategic plan in the Chinese case considers environmental, economic, institutional, and social characteristics of tourism development, which determines the necessary infrastructure and environment for the further development of tourism. This is contrasted with the absence of such a strategy in the case of Syria. Data were drawn from in-depth interviews, field observations, and document analysis. Qualitative research techniques were used to analyze the available data and to form a detailed description of the past, present, and future potential for tourism in the two regions. The study measures transportation, land use/land cover patterns, tourism and tourism development, urban development, and strategic plans. It applied the lessons learned from Chinese tourism innovations in Hainan to propose an executive plan for sustainable tourism development in the Syrian coastal region once the present war has ended. This requires the active participation of all relevant stakeholders from almost every domain despite differing interests. It further requires improving the integration of three separate developmental factors (social, environmental, and economic) as complementary rather than conflicting elements. Keywords; Syrian coastal region, Syria, Hainan province, China, strategic plans, development, policy, planning.
In this paper, we aim to address the challenges surrounding the translation of ancient Chinese text: (1) The linguistic gap due to the difference in eras results in translations that are poor in quality, and (2) most translations are missing the cont extual information that is often very crucial to understanding the text. To this end, we improve upon past translation techniques by proposing the following: We reframe the task as a multi-label prediction task where the model predicts both the translation and its particular era. We observe that this helps to bridge the linguistic gap as chronological context is also used as auxiliary information. We validate our framework on a parallel corpus annotated with chronology information and show experimentally its efficacy in producing quality translation outputs. We release both the code and the data for future research.
Communication between human and mobile agents is getting increasingly important as such agents are widely deployed in our daily lives. Vision-and-Dialogue Navigation is one of the tasks that evaluate the agent's ability to interact with humans for as sistance and navigate based on natural language responses. In this paper, we explore the Navigation from Dialogue History (NDH) task, which is based on the Cooperative Vision-and-Dialogue Navigation (CVDN) dataset, and present a state-of-the-art model which is built upon Vision-Language transformers. However, despite achieving competitive performance, we find that the agent in the NDH task is not evaluated appropriately by the primary metric -- Goal Progress. By analyzing the performance mismatch between Goal Progress and other metrics (e.g., normalized Dynamic Time Warping) from our state-of-the-art model, we show that NDH's sub-path based task setup (i.e., navigating partial trajectory based on its correspondent subset of the full dialogue) does not provide the agent with enough supervision signal towards the goal region. Therefore, we propose a new task setup called NDH-Full which takes the full dialogue and the whole navigation path as one instance. We present a strong baseline model and show initial results on this new task. We further describe several approaches that we try, in order to improve the model performance (based on curriculum learning, pre-training, and data-augmentation), suggesting potential useful training methods on this new NDH-Full task.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا