No Arabic abstract
Context: Decentralized applications on blockchain platforms are realized through smart contracts. However, participants who lack programming knowledge often have difficulties reading the smart contract source codes, which leads to potential security risks and barriers to participation. Objective: Our objective is to translate the smart contract source codes into natural language descriptions to help people better understand, operate, and learn smart contracts. Method: This paper proposes an automated translation tool for Solidity smart contracts, termed SolcTrans, based on an abstract syntax tree and formal grammar. We have investigated 3,000 smart contracts and determined the part of speeches of corresponding blockchain terms. Among them, we further filtered out contract snippets without detailed comments and left 811 snippets to evaluate the translation quality of SolcTrans. Results: Experimental results show that even with a small corpus, SolcTrans can achieve similar performance to the state-of-the-art code comments generation models for other programming languages. In addition, SolcTrans has consistent performance when dealing with code snippets with different lengths and gas consumption. Conclusion: SolcTrans can correctly interpret Solidity codes and automatically convert them into comprehensible English text. We will release our tool and dataset for supporting reproduction and further studies in related fields.
Recently, the automated translation of source code from one programming language to another by using automatic approaches inspired by Neural Machine Translation (NMT) methods for natural languages has come under study. However, such approaches suffer from the same problem as previous NMT approaches on natural languages, viz. the lack of an ability to estimate and evaluate the quality of the translations; and consequently ascribe some measure of interpretability to the models choices. In this paper, we attempt to estimate the quality of source code translations built on top of the TransCoder model. We consider the code translation task as an analog of machine translation (MT) for natural languages, with some added caveats. We present our main motivation from a user study built around code translation; and present a technique that correlates the confidences generated by that model to lint errors in the translated code. We conclude with some observations on these correlations, and some ideas for future work.
Smart contracts are automated or self-enforcing contracts that can be used to exchange assets without having to place trust in third parties. Many commercial transactions use smart contracts due to their potential benefits in terms of secure peer-to-peer transactions independent of external parties. Experience shows that many commonly used smart contracts are vulnerable to serious malicious attacks which may enable attackers to steal valuable assets of involving parties. There is therefore a need to apply analysis and automated repair techniques to detect and repair bugs in smart contracts before being deployed. In this work, we present the first general-purpose automated smart contract repair approach that is also gas-aware. Our repair method is search-based and searches among mutations of the buggy contract. Our method also considers the gas usage of the candidate patches by leveraging our novel notion of gas dominance relationship. We have made our smart contract repair tool SCRepair available open-source, for investigation by the wider community.
We present a model/executable specification of smart contract execution in Coq. Our formalization allows for inter-contract communication and generalizes existing work by allowing modelling of both depth-first execution blockchains (like Ethereum) and breadth-first execution blockchains (like Tezos). We represent smart contracts programs in Coqs functional language Gallina, enabling easier reasoning about functional correctness of concrete contracts than other approaches. In particular we develop a Congress contract in this style. This contract -- a simplified version of the infamous DAO -- is interesting because of its very dynamic communication pattern with other contracts. We give a high-level partial specification of the Congresss behavior, related to reentrancy, and prove that the Congress satisfies it for all possible smart contract execution orders.
Smart contracts are programs running on a blockchain. They are immutable to change, and hence can not be patched for bugs once deployed. Thus it is critical to ensure they are bug-free and well-designed before deployment. A Contract defect is an error, flaw or fault in a smart contract that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. The detection of contract defects is a method to avoid potential bugs and improve the design of existing code. Since smart contracts contain numerous distinctive features, such as the gas system. decentralized, it is important to find smart contract specified defects. To fill this gap, we collected smart-contract-related posts from Ethereum StackExchange, as well as real-world smart contracts. We manually analyzed these posts and contracts; using them to define 20 kinds of contract defects. We categorized them into indicating potential security, availability, performance, maintainability and reusability problems. To validate if practitioners consider these contract as harmful, we created an online survey and received 138 responses from 32 different countries. Feedback showed these contract defects are harmful and removing them would improve the quality and robustness of smart contracts. We manually identified our defined contract defects in 587 real world smart contract and publicly released our dataset. Finally, we summarized 5 impacts caused by contract defects. These help developers better understand the symptoms of the defects and removal priority.
Recent years have seen the rise of Deep Learning (DL) techniques applied to source code. Researchers have exploited DL to automate several development and maintenance tasks, such as writing commit messages, generating comments and detecting vulnerabilities among others. One of the long lasting dreams of applying DL to source code is the possibility to automate non-trivial coding activities. While some steps in this direction have been taken (e.g., learning how to fix bugs), there is still a glaring lack of empirical evidence on the types of code changes that can be learned and automatically applied by DL. Our goal is to make this first important step by quantitatively and qualitatively investigating the ability of a Neural Machine Translation (NMT) model to learn how to automatically apply code changes implemented by developers during pull requests. We train and experiment with the NMT model on a set of 236k pairs of code components before and after the implementation of the changes provided in the pull requests. We show that, when applied in a narrow enough context (i.e., small/medium-sized pairs of methods before/after the pull request changes), NMT can automatically replicate the changes implemented by developers during pull requests in up to 36% of the cases. Moreover, our qualitative analysis shows that the model is capable of learning and replicating a wide variety of meaningful code changes, especially refactorings and bug-fixing activities. Our results pave the way for novel research in the area of DL on code, such as the automatic learning and applications of refactoring.