ﻻ يوجد ملخص باللغة العربية
Code summarization aims to generate concise natural language descriptions of source code, which can help improve program comprehension and maintenance. Recent studies show that syntactic and structural information extracted from abstract syntax trees (ASTs) is conducive to summary generation. However, existing approaches fail to fully capture the rich information in ASTs because of the large size/depth of ASTs. In this paper, we propose a novel model CAST that hierarchically splits and reconstructs ASTs. First, we hierarchically split a large AST into a set of subtrees and utilize a recursive neural network to encode the subtrees. Then, we aggregate the embeddings of subtrees by reconstructing the split ASTs to get the representation of the complete AST. Finally, AST representation, together with source code embedding obtained by a vanilla code token encoder, is used for code summarization. Extensive experiments, including the ablation study and the human evaluation, on benchmarks have demonstrated the power of CAST. To facilitate reproducibility, our code and data are available at https://anonymous.4open.science/r/CAST/.
Code completion has become an essential component of integrated development environments. Contemporary code completion methods rely on the abstract syntax tree (AST) to generate syntactically correct code. However, they cannot fully capture the seque
Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or documenting t
Commit messages record code changes (e.g., feature modifications and bug repairs) in natural language, and are useful for program comprehension. Due to the frequent updates of software and time cost, developers are generally unmotivated to write comm
Source code summaries are short natural language descriptions of code snippets that help developers better understand and maintain source code. There has been a surge of work on automatic code summarization to reduce the burden of writing summaries m
Source code summaries are important for the comprehension and maintenance of programs. However, there are plenty of programs with missing, outdated, or mismatched summaries. Recently, deep learning techniques have been exploited to automatically gene