Towards Enhancing Database Education: Natural Language Generation Meets Query Execution Plans


Abstract in English

The database systems course is offered as part of an undergraduate computer science degree program in many major universities. A key learning goal of learners taking such a course is to understand how SQL queries are processed in a RDBMS in practice. Since a query execution plan (QEP) describes the execution steps of a query, learners can acquire the understanding by perusing the QEPs generated by a RDBMS. Unfortunately, in practice, it is often daunting for a learner to comprehend these QEPs containing vendor-specific implementation details, hindering her learning process. In this paper, we present a novel, end-to-end, generic system called lantern that generates a natural language description of a qep to facilitate understanding of the query execution steps. It takes as input an SQL query and its QEP, and generates a natural language description of the execution strategy deployed by the underlying RDBMS. Specifically, it deploys a declarative framework called pool that enables subject matter experts to efficiently create and maintain natural language descriptions of physical operators used in QEPs. A rule-based framework called RULE-LANTERN is proposed that exploits pool to generate natural language descriptions of QEPs. Despite the high accuracy of RULE-LANTERN, our engagement with learners reveal that, consistent with existing psychology theories, perusing such rule-based descriptions lead to boredom due to repetitive statements across different QEPs. To address this issue, we present a novel deep learning-based language generation framework called NEURAL-LANTERN that infuses language variability in the generated description by exploiting a set of paraphrasing tools and word embedding. Our experimental study with real learners shows the effectiveness of lantern in facilitating comprehension of QEPs.

Download