We create a large-scale dialogue corpus that provides pragmatic paraphrases to advance technology for understanding the underlying intentions of users. While neural conversation models acquire the ability to generate fluent responses through training on a dialogue corpus, previous corpora have mainly focused on the literal meanings of utterances. In reality, however, people do not always state their intentions directly. For example, if a person says ``I don't have enough budget.'' to the operator of a reservation service, they in fact mean ``please find a cheaper option for me.'' Our corpus provides a total of 71,498 indirect--direct utterance pairs, each accompanied by a multi-turn dialogue history extracted from the MultiWoZ dataset. In addition, we propose three tasks to benchmark the ability of models to recognize and generate indirect and direct utterances. We also investigate the performance of state-of-the-art pre-trained models as baselines.
The main purpose of the present research is to support Arabic text-to-speech synthesizers with
natural prosody, based on linguistic analysis of the texts to be synthesized and on automatic prosody
generation using rules deduced from the analysis of recorded signals for different types of Arabic
sentences. All
the types of Arabic sentences (declarative and constructive) were enumerated with the help of an expert in
Arabic linguistics. A textual corpus of about 2500 sentences covering most of these types was built and
recorded both with natural prosody and without prosody. These sentences were then analyzed to extract
the effect of prosody on the signal parameters and to build prosody generation rules. In this paper, we present
the results for negation sentences, applied to synthesized speech using the open-source tool MBROLA. The
results can be used with any parametric Arabic synthesizer. Future work will apply the rules to a new
Arabic synthesizer based on semi-syllable units, which is under development at the Higher Institute for
Applied Sciences and Technology.