No Arabic abstract
Robotic Process Automation (RPA) is a technology to automate routine work such as copying data across applications or filling in document templates using data from multiple applications. RPA tools allow organizations to automate a wide range of routines. However, identifying and scoping routines that can be automated using RPA tools is time consuming. Manual identification of candidate routines via interviews, walk-throughs, or job shadowing allow analysts to identify the most visible routines, but these methods are not suitable when it comes to identifying the long tail of routines in an organization. This article proposes an approach to discover automatable routines from logs of user interactions with IT systems and to synthesize executable specifications for such routines. The approach starts by discovering frequent routines at a control-flow level (candidate routines). It then determines which of these candidate routines are automatable and it synthetizes an executable specification for each such routine. Finally, it identifies semantically equivalent routines so as to produce a set of non-redundant automatable routines. The article reports on an evaluation of the approach using a combination of synthetic and real-life logs. The evaluation results show that the approach can discover automatable routines that are known to be present in a UI log, and that it identifies automatable routines that users recognize as such in real-life logs.
Deriving formal specifications from informal requirements is difficult since one has to take into account the disparate conceptual worlds of the application domain and of software development. To bridge the conceptual gap we propose controlled natural language as a textual view on formal specifications in logic. The specification language Attempto Controlled English (ACE) is a subset of natural language that can be accurately and efficiently processed by a computer, but is expressive enough to allow natural usage. The Attempto system translates specifications in ACE into discourse representation structures and into Prolog. The resulting knowledge base can be queried in ACE for verification, and it can be executed for simulation, prototyping and validation of the specification.
In this paper the reversibility of executable Interval Temporal Logic (ITL) specifications is investigated. ITL allows for the reasoning about systems in terms of behaviours which are represented as non-empty sequences of states. It allows for the specification of systems at different levels of abstraction. At a high level this specification is in terms of properties, for instance safety and liveness properties. At concrete level one can specify a system in terms of programming constructs. One can execute these concrete specification, i.e., test and simulate the behaviour of the system. In this paper we will formalise this notion of executability of ITL specifications. ITL also has a reflection operator which allows for the reasoning about reversed behaviours. We will investigate the reversibility of executable ITL specifications, i.e., how one can use this reflection operator to reverse the concrete behaviour of a particular system.
Behavioral software models play a key role in many software engineering tasks; unfortunately, these models either are not available during software development or, if available, they quickly become outdated as the implementations evolve. Model inference techniques have been proposed as a viable solution to extract finite-state models from execution logs. However, existing techniques do not scale well when processing very large logs, such as system-level logs obtained by combining component-level logs. Furthermore, in the case of component-based systems, existing techniques assume to know the definitions of communication channels between components. However, this information is usually not available in the case of systems integrating 3rd-party components with limited documentation. In this paper, we address the scalability problem of inferring the model of a component-based system from the individual component-level logs, when the only available information about the system are high-level architecture dependencies among components and a (possibly incomplete) list of log message templates denoting communication events between components. Our model inference technique, called SCALER, follows a divide and conquer approach. The idea is to first infer a model of each system component from the corresponding logs; then, the individual component models are merged together taking into account the dependencies among components, as reflected in the logs. We evaluated SCALER in terms of scalability and accuracy, using a dataset of logs from an industrial system; the results show that SCALER can process much larger logs than a state-of-the-art tool, while yielding more accurate models.
Web API specifications are machine-readable descriptions of APIs. These specifications, in combination with related tooling, simplify and support the consumption of APIs. However, despite the increased distribution of web APIs, specifications are rare and their creation and maintenance heavily relies on manual efforts by third parties. In this paper, we propose an automatic approach and an associated tool called D2Spec for extracting specifications from web API documentation pages. Given a seed online documentation page on an API, D2Spec first crawls all documentation pages on the API, and then uses a set of machine learning techniques to extract the base URL, path templates, and HTTP methods, which collectively describe the endpoints of an API. We evaluated whether D2Spec can accurately extract endpoints from documentation on 120 web APIs. The results showed that D2Spec achieved a precision of 87.5% in identifying base URLs, a precision of 81.3% and a recall of 80.6% in generating path templates, and a precision of 84.4% and a recall of 76.2% in extracting HTTP methods. In addition, we found that D2Spec was useful when applied to APIs with pre-existing API specifications: D2Spec revealed many inconsistencies between web API documentation and their corresponding publicly available specifications. Thus, D2Spec can be used by web API providers to keep documentation and specifications in synchronization.
The FermaT transformation system, based on research carried out over the last sixteen years at Durham University, De Montfort University and Software Migrations Ltd., is an industrial-strength formal transformation engine with many applications in program comprehension and language migration. This paper is a case study which uses automated plus manually-directed transformations and abstractions to convert an IBM 370 Assembler code program into a very high-level abstract specification.