Software systems increasingly depend on data, particularly with the rising use of machine learning, and developers are looking for new sources of data. Open Data Ecosystems (ODE) is an emerging concept for data sharing under public licenses in software ecosystems, similar to Open Source Software (OSS). It has certain similarities to Open Government Data (OGD), where public agencies share data for innovation and transparency. We aimed to explore open data ecosystems involving commercial actors. To this end, we organized five focus groups with 27 practitioners from 22 companies, public organizations, and research institutes. Based on the outcomes, we surveyed three cases of emerging ODE practice to further understand the concepts and to validate the initial findings. The main outcome is an initial conceptual model of ODE value, intrinsics, governance, and evolution, together with propositions for practice and further research. We found that an ODE must be value driven. Regarding the intrinsics of data, we found that their type, meta-data, and legal frameworks influence their openness. We also found that the characteristics of ecosystem initiation, organization, data acquisition, and openness differentiate ODEs, and we advise research and practice to take these characteristics into consideration.
In globally distributed projects, virtual teams are often partially dispersed. One common setup occurs when several members from one company work with a large outsourcing vendor based in another country. Further, the introduction of the popular BizDevOps concept has increased the need to cooperate across departments and to reduce the age-old disconnect between business strategy and technical development. Establishing good collaboration in partially distributed BizDevOps teams requires extensive collaboration and communication techniques. Nowadays, a common approach is to rely on collaboration through pull requests and frequent communication on Slack. To investigate barriers to pull requests in distributed teams, we examined an organization located in Scandinavia where cross-functional BizDevOps teams collaborated with off-site team members in India. Data were collected by conducting 14 interviews, observing the team for 23 full days, and observing 37 meetings. We found that the pull-request approach worked very well locally but not across sites. We found barriers such as domain complexity, different agile processes (timeboxed vs. flow-based development), and employee turnover. Using an intellectual capital lens on our findings, we discuss barriers and positive and negative effects on the success of the pull-request approach.
Autonomous driving shows great potential to reform modern transportation, and its safety is attracting much attention from the public. Autonomous driving systems generally include deep neural networks (DNNs) to achieve better performance (e.g., accuracy in object detection and trajectory prediction). However, compared with traditional software systems, this new paradigm (i.e., program + DNNs) makes software testing more difficult. Recently, the software engineering community has spent significant effort on developing new testing methods for autonomous driving systems. However, it is not clear to what extent those testing methods have addressed the needs of industrial practitioners of autonomous driving. To fill this gap, in this paper, we present the first comprehensive study to identify the current practices and needs of testing autonomous driving systems in industry. We conducted semi-structured interviews with developers from 10 autonomous driving companies and surveyed 100 developers who have worked on autonomous driving systems. Through thematic analysis of the interview and questionnaire data, we identified five urgent industry needs in testing autonomous driving systems. We further analyzed the limitations of existing testing methods in addressing those needs and proposed several future directions for software testing researchers.
Software systems are designed according to guidelines and constraints defined by business rules. Some of these constraints define the allowable or required values for data handled by the systems. These data constraints usually originate from the problem domain (e.g., regulations), and developers must write code that enforces them. Understanding how data constraints are implemented is essential for testing, debugging, and software change. Unfortunately, there are no widely-accepted guidelines or best practices on how to implement data constraints. This paper presents an empirical study that investigates how data constraints are implemented in Java. We study the implementation of 187 data constraints extracted from the documentation of eight real-world Java software systems. First, we perform a qualitative analysis of the textual description of data constraints and identify four data constraint types. Second, we manually identify the implementations of these data constraints and reveal that they can be grouped into 30 implementation patterns. The analysis of these implementation patterns indicates that developers prefer a handful of patterns when implementing data constraints and deviations from these patterns are associated with unusual implementation decisions or code smells. Third, we develop a tool-assisted protocol that allows us to identify 256 additional trace links for the data constraints implemented using the 13 most common patterns. We find that almost half of these data constraints have multiple enforcing statements, which are code clones of different types.
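As an illustration of what enforcing a data constraint can look like, the sketch below uses a made-up constraint ("a port number must be between 1 and 65535"). The constraint, the class `ServerConfig`, and its methods are hypothetical and are not taken from the eight systems studied or from the paper's pattern catalog. The sketch shows one constraint with two enforcing statements that are near-duplicates of each other, the kind of situation the last finding describes.

```java
// Hypothetical example: the constraint and all names below are assumptions
// made for illustration, not drawn from the systems studied in the paper.
final class ServerConfig {
    // Data constraint: a port number must be between 1 and 65535.
    static final int MIN_PORT = 1;
    static final int MAX_PORT = 65535;

    private int port = 8080;

    // Enforcing statement 1: the setter rejects out-of-range values.
    void setPort(int port) {
        if (port < MIN_PORT || port > MAX_PORT) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        this.port = port;
    }

    // Enforcing statement 2: a near-duplicate of the check above, repeated
    // where the value is parsed from text (a code clone of the first check).
    static int parsePort(String text) {
        int value = Integer.parseInt(text.trim());
        if (value < MIN_PORT || value > MAX_PORT) {
            throw new IllegalArgumentException("port out of range: " + value);
        }
        return value;
    }

    int getPort() {
        return port;
    }
}
```

Both `setPort` and `parsePort` enforce the same constraint, so a change to the allowed range has to be made consistently in each place. This is one reason tracing a constraint to all of its enforcing statements matters for testing and software change.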
Despite sophisticated phishing email detection systems and training and awareness programs, humans continue to be tricked by phishing emails. In an attempt to understand why phishing email attacks still work, we carried out an empirical study to investigate how people make response decisions while reading their emails. We used a think-aloud method and follow-up interviews to collect data from 19 participants. The analysis of the collected data enabled us to identify eleven factors that influence people's response decisions to both phishing and legitimate emails. Based on the identified factors, we discuss how people can be susceptible to phishing attacks due to flaws in their decision-making processes. Furthermore, we propose design directions for developing a behavioral plugin for email clients that can be used to nudge people toward secure behaviors, enabling them to respond better to phishing emails.
This paper investigates aspects of an audio classification pipeline that are appropriate for monitoring bird species on edge devices. These aspects include transfer learning, data augmentation, and model optimization. The hope is that the resulting models will be good candidates for deployment on edge devices to monitor bird populations. Two classification approaches are considered: one that explores the effectiveness of a traditional Deep Neural Network (DNN) and another that makes use of convolutional layers. This study aims to contribute empirical evidence of the merits and demerits of each approach.