No Arabic abstract
In this paper, we present a systematic approach that transforms the program execution trace into the frequency domain and precisely identifies program phases. The analyzed results can be embedded into program code to mark the starting point and execution characteristics, such as CPI (Cycles per Instruction), of each phase. The so generated information can be applied to runtime program phase prediction. With the precise program phase information, more intelligent software and system optimization techniques can be further explored and developed.
The advance in machine learning (ML)-driven natural language process (NLP) points a promising direction for automatic bug fixing for software programs, as fixing a buggy program can be transformed to a translation task. While software programs contain much richer information than one-dimensional natural language documents, pioneering work on using ML-driven NLP techniques for automatic program repair only considered a limited set of such information. We hypothesize that more comprehensive information of software programs, if appropriately utilized, can improve the effectiveness of ML-driven NLP approaches in repairing software programs. As the first step towards proving this hypothesis, we propose a unified representation to capture the syntax, data flow, and control flow aspects of software programs, and devise a method to use such a representation to guide the transformer model from NLP in better understanding and fixing buggy programs. Our preliminary experiment confirms that the more comprehensive information of software programs used, the better ML-driven NLP techniques can perform in fixing bugs in these programs.
Automated program repair (APR) has attracted great research attention, and various techniques have been proposed. Search-based APR is one of the most important categories among these techniques. Existing researches focus on the design of effective mutation operators and searching algorithms to better find the correct patch. Despite various efforts, the effectiveness of these techniques are still limited by the search space explosion problem. One of the key factors attribute to this problem is the quality of fault spaces as reported by existing studies. This motivates us to study the importance of the fault space to the success of finding a correct patch. Our empirical study aims to answer three questions. Does the fault space significantly correlate with the performance of search-based APR? If so, are there any indicative measurements to approximate the accuracy of the fault space before applying expensive APR techniques? Are there any automatic methods that can improve the accuracy of the fault space? We observe that the accuracy of the fault space affects the effectiveness and efficiency of search-based APR techniques, e.g., the failure rate of GenProg could be as high as $60%$ when the real fix location is ranked lower than 10 even though the correct patch is in the search space. Besides, GenProg is able to find more correct patches and with fewer trials when given a fault space with a higher accuracy. We also find that the negative mutation coverage, which is designed in this study to measure the capability of a test suite to kill the mutants created on the statements executed by failing tests, is the most indicative measurement to estimate the efficiency of search-based APR. Finally, we confirm that automated generated test cases can help improve the accuracy of fault spaces, and further improve the performance of search-based APR techniques.
We introduce Causal Program Dependence Analysis (CPDA), a dynamic dependence analysis that applies causal inference to model the strength of program dependence relations in a continuous space. CPDA observes the association between program elements by constructing and executing modifi
This article introduces an effective generalization of the polar flavor of the Fourier Theorem based on a new method of analysis. Under the premises of the new theory an ample class of functions become viable as bases, with the further advantage of using the same basis for analysis and reconstruction. In fact other tools, like the wavelets, admit specially built nonorthogonal bases but require different bases for analysis and reconstruction (biorthogonal and dual bases) and vectorial coordinates; this renders those systems unintuitive and computing intensive. As an example of the advantages of the new generalization of the Fourier Theorem, this paper introduces a novel method for the synthesis that is based on frequency-phase series of square waves (the equivalent of the polar Fourier Theorem but for nonorthogonal bases). The resulting synthesizer is very efficient needing only few components, frugal in terms of computing needs, and viable for many applications.
In Spectrum-Based Fault Localization (SBFL), a suspiciousness score is assigned to each code element based on test coverage and test outcomes. The scores are then used to rank the code elements relative to each other in order to aid the programmer during the debugging process when seeking the source of a fault. However, probably none of the known SBFL formulae are guaranteed to produce different scores for all the program elements, hence ties emerge between the code elements. Based on our experiments, ties in SBFL are prevalent: in Defects4J, 54-56% of buggy methods are members of ties, i.e., there is at least one other method with the same score in these cases (but typically much more, on average 6), and this inevitably reduces the effectiveness of any SBFL approach. In this work, we present a technique to break ties in such cases based on the so-called method calls frequencies. This counts the number of different contexts of method calls (both as callees and as callers) in failing test cases. The intuition is that if a method appears in many different calling contexts during a failing test case, it will be more suspicious and get a higher rank position compared to other methods with the same scores. This method can be applied to any underlying SBFL formula, and can favourably break the occurring ranks in the ties in many cases. The experimental results show that our novel tie-breaking strategy achieved a significant reduction in both size and number of critical ties in our benchmark. In 72-73% of the cases, the ties were completely eliminated and the average reduction rate was more than 80%.