
Learning Test Traces

Posted by: Roni Stern
Publication date: 2019
Research field: Informatics Engineering
Language: English





Modern software projects include automated tests written to check the program's functionality. The set of functions invoked by a test is called the trace of the test, and the action of obtaining a trace is called tracing. There are many tracing tools, since traces are useful for a variety of software engineering tasks such as test generation, fault localization, and test execution planning. A major drawback in using test traces is that obtaining them, i.e., tracing, can be costly in terms of computational resources and runtime. Prior work attempted to address this in various ways, e.g., by selectively tracing only some of the software components or compressing the trace on-the-fly. However, all these approaches still require building the project and executing the test in order to obtain its (partial, possibly compressed) trace, which is still very costly in many cases. In this work, we propose a method to predict the trace of each test without executing it, based only on static properties of the test and the tested program, as well as past experience on different tests. This prediction is done by applying supervised learning to learn the relation between static features of a test and a function and the likelihood that the function will appear in the test's trace. Then, we show how to use the predicted traces in a recent automated troubleshooting paradigm called Learn, Diagnose, and Plan (LDP), instead of the actual, costly-to-obtain test traces. In a preliminary evaluation on real-world open-source projects, we observe that our prediction quality is reasonable. In addition, using our trace predictions in LDP yields results nearly identical to those obtained with real traces, while requiring less overhead.
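As a rough illustration of the idea (not the paper's actual implementation), the sketch below trains an off-the-shelf classifier on hand-crafted static features of (test, function) pairs and thresholds the predicted probabilities to form a predicted trace. The feature names, the toy data, the classifier choice, and the 0.5 threshold are assumptions made for the example.

```python
# Minimal sketch: predict whether a function will appear in a test's trace
# from static (test, function) pair features, instead of running the test.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row encodes static features of one (test, function) pair, e.g.:
# [name-token overlap, same package (0/1), call-graph distance, function LOC]
X_train = np.array([
    [3, 1, 1, 40],   # strong lexical/structural link -> function was traced
    [0, 0, 5, 10],   # unrelated test and function    -> function was not traced
    [2, 1, 2, 25],
    [1, 0, 4, 60],
])
y_train = np.array([1, 0, 1, 0])  # 1 = function appeared in the test's real trace

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Predicted trace of a new test: keep the functions whose inclusion
# probability exceeds a threshold, without building or executing anything.
candidate_pairs = np.array([[2, 1, 1, 30], [0, 0, 6, 15]])
probs = clf.predict_proba(candidate_pairs)[:, 1]
predicted_trace = [i for i, p in enumerate(probs) if p >= 0.5]
print(probs, predicted_trace)
```

A downstream consumer such as a troubleshooting algorithm would then use `predicted_trace` in place of the real, costly-to-obtain trace.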




Read also

Contract conformance is hard to determine statically, prior to the deployment of large pieces of software. A scalable alternative is to monitor for contract violations post-deployment: once a violation is detected, the trace characterising the offending execution is analysed to pinpoint the source of the offence. A major drawback with this technique is that, often, contract violations take time to surface, resulting in long traces that are hard to analyse. This paper proposes a methodology together with an accompanying tool for simplifying traces and assisting contract-violation debugging.
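A toy sketch of the general idea of trace simplification, not that paper's methodology or tool: assuming a hypothetical "no write after close" contract and invented event names, it collapses repeated events and keeps only the short offending suffix for the analyst.

```python
# Collapse runs of identical trace events, then keep the tail that contains
# the (hypothetical) contract violation instead of the full, long trace.
from itertools import groupby

trace = ["open", "read", "read", "read", "write", "write", "close", "write"]

# 1) Collapse consecutive repeats into (event, count) pairs.
collapsed = [(event, len(list(run))) for event, run in groupby(trace)]

# 2) Keep only the suffix from the last "close" onward, where the violating
#    "write" occurs, so the analyst sees a short offending fragment.
last_close = max(i for i, (event, _) in enumerate(collapsed) if event == "close")
offending_suffix = collapsed[last_close:]
print(collapsed)
print(offending_suffix)
```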
It is integral to test API functions of widely used deep learning (DL) libraries. The effectiveness of such testing requires DL-specific input constraints of these API functions. Such constraints enable the generation of valid inputs, i.e., inputs that follow these DL-specific constraints, to explore deeply and test the core functionality of API functions. Existing fuzzers have no knowledge of such constraints, and existing constraint extraction techniques are ineffective for extracting DL-specific input constraints. To fill this gap, we design and implement a document-guided fuzzing technique, D2C, for API functions of DL libraries. D2C leverages sequential pattern mining to generate rules for extracting DL-specific constraints from API documents and uses these constraints to guide the fuzzing to generate valid inputs automatically. D2C also generates inputs that violate these constraints to test the input validity checking code. In addition, D2C uses the constraints to generate boundary inputs to detect more bugs. Our evaluation of three popular DL libraries (TensorFlow, PyTorch, and MXNet) shows that D2C's accuracy in extracting input constraints is 83.3% to 90.0%. D2C detects 121 bugs, while a baseline fuzzer without input constraints detects only 68 bugs. Most (89) of the 121 bugs are previously unknown, 54 of which have been fixed or confirmed by developers after we report them. In addition, D2C detects 38 inconsistencies within documents, including 28 that are fixed or confirmed after we report them.
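The sketch below is only a simplified illustration of document-guided input generation, not D2C itself: it stands in for mined sequential patterns with a single regex rule, extracts an allowed-dtype constraint from one documentation sentence, and generates one conforming and one violating input. The documentation string, the extraction rule, and the input format are all assumptions.

```python
# Toy document-guided input generation: mine a dtype constraint from an API
# doc sentence, then build one valid and one constraint-violating input.
import re
import random

doc = "x: A Tensor. Must be one of the following types: float32, int32."

# Illustrative stand-in for rules mined from documentation.
match = re.search(r"Must be one of the following types: ([\w, ]+)\.", doc)
allowed_dtypes = [t.strip() for t in match.group(1).split(",")] if match else []

def make_input(dtype):
    """Generate a small random 'tensor' (nested list) tagged with a dtype."""
    value = [[random.random() for _ in range(2)] for _ in range(2)]
    return {"dtype": dtype, "value": value}

valid_input = make_input(random.choice(allowed_dtypes))   # follows the constraint
invalid_input = make_input("complex128")                  # violates it on purpose
print(allowed_dtypes, valid_input["dtype"], invalid_input["dtype"])
```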
Software testing is an essential part of the software lifecycle and requires a substantial amount of time and effort. It has been estimated that software developers spend close to 50% of their time on testing the code they write. For these reasons, a long-standing goal within the research community is to (partially) automate software testing. While several techniques and tools have been proposed to automatically generate test methods, recent work has criticized the quality and usefulness of the assert statements they generate. Therefore, we employ a Neural Machine Translation (NMT) based approach called Atlas (AuTomatic Learning of Assert Statements) to automatically generate meaningful assert statements for test methods. Given a test method and a focal method (i.e., the main method under test), Atlas can predict a meaningful assert statement to assess the correctness of the focal method. We applied Atlas to thousands of test methods from GitHub projects, and it was able to predict the exact assert statement manually written by developers in 31% of the cases when only considering the top-1 predicted assert. When considering the top-5 predicted assert statements, Atlas is able to predict exact matches in 50% of the cases. These promising results hint at the potential usefulness of our approach as (i) a complement to automatic test case generation techniques, and (ii) a code completion support for developers, who can benefit from the recommended assert statements while writing test code.
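A minimal sketch of how such an NMT-style assert generator is typically framed, not Atlas's actual pipeline: the source sequence concatenates the tokens of the test method and the focal method, and the target sequence is the developer-written assert. The method bodies, placeholder token, separator, and tokenizer below are illustrative assumptions.

```python
# Frame (test method, focal method) -> assert statement as a seq2seq problem.
import re

test_method = 'public void testAdd() { int r = calc.add(2, 3); "<AssertPlaceHolder>"; }'
focal_method = 'public int add(int a, int b) { return a + b; }'
target_assert = 'assertEquals(5, r);'

def tokenize(code):
    # Crude identifier/punctuation tokenizer; a real pipeline uses a code lexer.
    return re.findall(r"[A-Za-z_]\w*|\S", code)

source = tokenize(test_method) + ["<SEP>"] + tokenize(focal_method)
target = tokenize(target_assert)
print(len(source), target)
# A trained encoder-decoder maps `source` to `target`; at evaluation time the
# top-k beam candidates are compared against the developer-written assert.
```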
Machine learning may enable the automated generation of test oracles. We have characterized emerging research in this area through a systematic literature review examining oracle types, researcher goals, the ML techniques applied, how the generation process was assessed, and the open research challenges in this emerging field. Based on a sample of 22 relevant studies, we observed that ML algorithms generated test verdict, metamorphic relation, and - most commonly - expected output oracles. Almost all studies employ a supervised or semi-supervised approach, trained on labeled system executions or code metadata - including neural networks, support vector machines, adaptive boosting, and decision trees. Oracles are evaluated using the mutation score, correct classifications, accuracy, and ROC. Work to date shows great promise, but there are significant open challenges regarding the requirements imposed on training data, the complexity of modeled functions, the ML algorithms employed - and how they are applied - the benchmarks used by researchers, and the replicability of the studies. We hope that our findings will serve as a roadmap and inspiration for researchers in this field.
Deep Learning (DL) components are routinely integrated into software systems that need to perform complex tasks such as image or natural language processing. The adequacy of the test data used to test such systems can be assessed by their ability to expose artificially injected faults (mutations) that simulate real DL faults. In this paper, we describe an approach to automatically generate new test inputs that can be used to augment the existing test set so that its capability to detect DL mutations increases. Our tool DeepMetis implements a search-based input generation strategy. To account for the non-determinism of the training and the mutation processes, our fitness function involves multiple instances of the DL model under test. Experimental results show that the tool is effective at augmenting the given test set, increasing its capability to detect mutants by 63% on average. A leave-one-out experiment shows that the augmented test set is capable of exposing unseen mutants, which simulate the occurrence of yet undetected faults.
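A toy sketch of the kind of fitness signal described above, with stub callables standing in for trained model instances; the exact formula and the stub models are assumptions for illustration, not DeepMetis's fitness function.

```python
# Fitness over multiple model instances: an input is promising if the original
# model instances still agree with its label while mutant instances fail on it.
def fitness(x, label, original_models, mutant_models):
    """original_models / mutant_models: lists of callables input -> predicted label."""
    ok_original = sum(m(x) == label for m in original_models) / len(original_models)
    ok_mutant = sum(m(x) == label for m in mutant_models) / len(mutant_models)
    # High when the input stays valid for the original model and kills the mutant.
    return ok_original - ok_mutant

# Hypothetical usage with stub "models" standing in for trained DL instances.
originals = [lambda x: x > 0, lambda x: x > 0]
mutants = [lambda x: x > 5, lambda x: x > 0]
print(fitness(3, True, originals, mutants))   # 1.0 - 0.5 = 0.5
```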