Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Interactive Duplicate Search in Software Documentation

265 0 0.0 ( 0 )

Download Cite

Added by Dmitrij Koznov Mr

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors D.V. Luciv - D.V. Koznov - A.A. Shelikhovskii

Software Engineering Data Structures and Algorithms

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Various software features such as classes, methods, requirements, and tests often have similar functionality. This can lead to emergence of duplicates in their descriptive documentation. Uncontrolled duplicates created via copy/paste hinder the process of documentation maintenance. Therefore, the task of duplicate detection in software documentation is of importance. Solving it makes planned reuse possible, as well as creating and using templates for unification and automatic generation of documentation. In this paper, we present an interactive process for duplicate detection that involves the user in order to conduct meaningful search. It includes a new formal definition of a near duplicate, a pattern-based, and the proof of its completeness. Moreover, we demonstrate the results of experimenting on a collection of documents of several industrial projects.

rate research

Documentation Generation as Information Visualization

92 - Will Crichton 2020

Automatic documentation generation tools, or auto docs, are widely used to visualize information about APIs. However, each auto doc tool comes with its own unique representation of API information. In this paper, I use an information visualization analysis of auto docs to generate potential design principles for improving their usability. Developers use auto docs as a reference by looking up relevant API primitives given partial information, or leads, about its name, type, or behavior. I discuss how auto docs can better support searching and scanning on these leads, e.g. by providing more information-dense visualizations of method signatures.

Software Engineering Human-Computer Interaction

Analyzing Web Search Behavior for Software Engineering Tasks

322 - Nikitha Rao , Chetan Bansal , Thomas Zimmermann 2019

Web search plays an integral role in software engineering (SE) to help with various tasks such as finding documentation, debugging, installation, etc. In this work, we present the first large-scale analysis of web search behavior for SE tasks using the search query logs from Bing, a commercial web search engine. First, we use distant supervision techniques to build a machine learning classifier to extract the SE search queries with an F1 score of 93%. We then perform an analysis on one million search sessions to understand how software engineering related queries and sessions differ from other queries and sessions. Subsequently, we propose a taxonomy of intents to identify the various contexts in which web search is used in software engineering. Lastly, we analyze millions of SE queries to understand the distribution, search metrics and trends across these SE search intents. Our analysis shows that SE related queries form a significant portion of the overall web search traffic. Additionally, we found that there are six major intent categories for which web search is used in software engineering. The techniques and insights can not only help improve existing tools but can also inspire the development of new tools that aid in finding information for SE related tasks.

Software Engineering Information Retrieval

Duplicate-sensitivity Guided Transformation Synthesis for DBMS Correctness Bug Detection

190 - Yushan Zhang , Peisen Yao , Rongxin Wu 2021

Database Management System (DBMS) plays a core role in modern software from mobile apps to online banking. It is critical that DBMS should provide correct data to all applications. When the DBMS returns incorrect data, a correctness bug is triggered. Current production-level DBMSs still suffer from insufficient testing due to the limited hand-written test cases. Recently several works proposed to automatically generate many test cases with query transformation, a process of generating an equivalent query pair and testing a DBMS by checking whether the system returns the same result set for both queries. However, all of them still heavily rely on manual work to provide a transformation which largely confines their exploration of the valid input query space. This paper introduces duplicate-sensitivity guided transformation synthesis which automatically finds new transformations by first synthesizing many candidates then filtering the nonequivalent ones. Our automated synthesis is achieved by mutating a query while keeping its duplicate sensitivity, which is a necessary condition for query equivalence. After candidate synthesis, we keep the mutant query which is equivalent to the given one by using a query equivalent checker. Furthermore, we have implemented our idea in a tool Eqsql and used it to test the production-level DBMSs. In two months, we detected in total 30 newly confirmed and unique bugs in MySQL, TiDB and CynosDB.

Software Engineering Databases Programming Languages

An Empirical Study of Software Exceptions in the Field using Search Logs

353 - Foyzul Hassan , Chetan Bansal , Nachiappan Nagappan 2020

Software engineers spend a substantial amount of time using Web search to accomplish software engineering tasks. Such search tasks include finding code snippets, API documentation, seeking help with debugging, etc. While debugging a bug or crash, one of the common practices of software engineers is to search for information about the associated error or exception traces on the internet. In this paper, we analyze query logs from a leading commercial general-purpose search engine (GPSE) such as Google, Yahoo! or Bing to carry out a large scale study of software exceptions. To the best of our knowledge, this is the first large scale study to analyze how Web search is used to find information about exceptions. We analyzed about 1 million exception related search queries from a random sample of 5 billion web search queries. To extract exceptions from unstructured query text, we built a novel and high-performance machine learning model with a F1-score of 0.82. Using the machine learning model, we extracted exceptions from raw queries and performed popularity, effort, success, query characteristic and web domain analysis. We also performed programming language-specific analysis to give a better view of the exception search behavior. These techniques can help improve existing methods, documentation and tools for exception analysis and prediction. Further, similar techniques can be applied for APIs, frameworks, etc.

Software Engineering Information Retrieval

Software Estimations Risk in Pakistan Software Industry

137 - Suresh Kumar , Qaisar Imtiaz , Sarmad Mahar 2021

Software and IT industry in Pakistan have seen a dramatic growth and success in past few years and is expected to get doubled by 2020, according to a research. Software development life cycle comprises of multiple phases, activities and techniques that can lead to successful projects, and software evaluation is one of the vital and important parts of that. Software estimation can alone be the reason of product success factor or the products failure factor. To estimate the right cost, effort and resources is an art. But it is also very important to include the risks that may arise in the in a software project which can affect your estimates. In this paper, we highlight how the risks in Pakistan Software Industry can affect the estimates and how to mitigate them.

Software Engineering

comments

Fetching comments

Peninsula Private University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Interactive Duplicate Search in Software Documentation

Ask ChatGPT about the research

No Arabic abstract

Read More