Many methods have been proposed to estimate how much effort is required to build and maintain software. Much of that research assumes a classic, waterfall-based approach rather than contemporary projects (where the development process may be more iterative than linear in nature). Also, much of that work tries to recommend a single method, an approach that makes the dubious assumption that one method can handle the diversity of software project data. To address these drawbacks, we apply a configuration technique called ROME (Rapid Optimizing Methods for Estimation), which uses sequential model-based optimization (SMO) to find what combination of effort estimation techniques works best for a particular data set. We test this method using data from 1161 classic waterfall projects and 120 contemporary projects (from GitHub). In terms of magnitude of relative error and standardized accuracy, we find that ROME achieves better performance than existing state-of-the-art methods for both classic and contemporary problems. In addition, we conclude that we should not recommend one method for estimation. Rather, it is better to search through a wide range of different methods to find what works best for local data. To the best of our knowledge, this is the largest effort estimation experiment yet attempted and the only one to test its methods on both classic and contemporary projects.
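The two evaluation measures named above are standard in the effort estimation literature: magnitude of relative error (MRE) compares predicted to actual effort per project, while standardized accuracy (SA) compares a predictor's mean absolute error to that of random guessing. A minimal Python sketch of both measures (the effort values below are illustrative, not data from the study) is:

    # Sketch of two common effort estimation accuracy measures; the effort
    # values below are made-up examples, not data from the study.
    import numpy as np

    def mre(actual, predicted):
        """Magnitude of relative error: |actual - predicted| / actual, per project."""
        actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
        return np.abs(actual - predicted) / actual

    def sa(actual, predicted, runs=1000, seed=0):
        """Standardized accuracy: 1 - MAE / MAE_guess, where MAE_guess is the
        mean absolute error of random guessing (predicting a randomly chosen
        other project's actual effort)."""
        rng = np.random.default_rng(seed)
        actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
        mae = np.mean(np.abs(actual - predicted))
        guesses = [np.mean(np.abs(actual - rng.permutation(actual)))
                   for _ in range(runs)]
        return 1.0 - mae / np.mean(guesses)

    actual    = [320, 150,  90, 400]   # person-hours (illustrative only)
    predicted = [300, 180, 100, 350]
    print(mre(actual, predicted).mean(), sa(actual, predicted))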
In the domain of software engineering, our efforts as researchers to advise industry on which software practices might be applied most effectively are limited by our lack of evidence-based information about the relationships between context and practice efficacy. In order to accumulate such evidence, a model for context is required. We are in the exploratory stage of evolving a model of context for situated software practices. In this paper, we give an overview of the evolution of our proposed model. Our analysis has exposed a lack of clarity in the meanings of terms reported in the literature. Our base model dimensions are People, Place, Product and Process. Our contributions are a deepening of our understanding of how to scope contextual factors when considering software initiatives and the proposal of an initial theoretical construct for context. Study limitations relate to possible subjectivity in the analysis and a restricted evaluation base. In the next stage of the research, we will collaborate with academics and practitioners to formally refine the model.
It is widely acknowledged by researchers and practitioners that software development methodologies are generally adapted to suit specific project contexts. Research into practices-as-implemented has been fragmented and has tended to focus either on the strength of adherence to a specific methodology or on how the efficacy of specific practices is affected by contextual factors. We submit that a more holistic, integrated approach to investigating context-related best practice is needed. We propose a six-dimensional model of the problem-space, with the dimensions organisational drivers (why), space and time (where), culture (who), product life-cycle stage (when), product constraints (what), and engagement constraints (how). We test our model by using it to describe and explain a reported implementation study. Our contributions are a novel approach to understanding situated software practices and a preliminary model for software contexts.
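To make the six dimensions concrete, one way a given project's context could be recorded against the model is sketched below; the field names and example values are illustrative assumptions, not taken from the paper.

    # Illustrative encoding of the six-dimensional context model.
    # Field names and example values are assumptions for demonstration only.
    from dataclasses import dataclass

    @dataclass
    class ProjectContext:
        organisational_drivers: str   # why   - e.g. time to market, regulatory compliance
        space_and_time: str           # where - e.g. co-located vs. distributed teams
        culture: str                  # who   - e.g. organisational and team culture
        lifecycle_stage: str          # when  - e.g. greenfield, maintenance
        product_constraints: str      # what  - e.g. safety-critical, embedded
        engagement_constraints: str   # how   - e.g. fixed-price contract, in-house

    ctx = ProjectContext(
        organisational_drivers="reduce time to market",
        space_and_time="distributed across two sites",
        culture="small, self-organising team",
        lifecycle_stage="initial development",
        product_constraints="web application, no safety requirements",
        engagement_constraints="in-house product development",
    )
    print(ctx)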
How can we make software analytics simpler and faster? One method is to match the complexity of the analysis to the intrinsic complexity of the data being explored. For example, hyperparameter optimizers find control settings for data miners that improve the predictions generated via software analytics. Sometimes, very fast hyperparameter optimization can be achieved simply by DODGE-ing away from things tried before. But when is it wise to use DODGE, and when must we use more complex (and much slower) optimizers? To answer this, we applied hyperparameter optimization to 120 SE data sets (exploring bad smell detection, predicting GitHub issue close time, bug report analysis, and defect prediction) as well as to dozens of other non-SE problems. We find that DODGE works best for data sets with low intrinsic dimensionality (D = 3) and very poorly for higher-dimensional data (D over 8). Nearly all the SE data seen here were intrinsically low-dimensional, indicating that DODGE is applicable for many SE analytics tasks.
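The "DODGE-ing" mentioned above refers to deprecating candidate configurations whose results are indistinguishable (within some epsilon) from results already seen, so the search spends its budget elsewhere. A simplified Python sketch of that idea follows; the tuning problem and the weighting scheme are placeholders, not the original tool.

    # Simplified DODGE-style search. `options` maps each hyperparameter to a
    # list of candidate values. Values that keep producing results within
    # epsilon of earlier results are down-weighted and sampled less often.
    import random

    def dodge(evaluate, options, budget=30, epsilon=0.2, seed=1):
        rng = random.Random(seed)
        weights = {name: {v: 1.0 for v in vals} for name, vals in options.items()}
        seen, best = [], None
        for _ in range(budget):
            cfg = {name: rng.choices(list(w), weights=list(w.values()))[0]
                   for name, w in weights.items()}
            score = evaluate(cfg)                      # lower is better
            novel = all(abs(score - s) >= epsilon for s in seen)
            for name, v in cfg.items():                # reward novel results, dodge stale ones
                weights[name][v] = max(0.1, weights[name][v] + (1.0 if novel else -0.5))
            seen.append(score)
            if best is None or score < best[1]:
                best = (cfg, score)
        return best

    # Placeholder tuning problem: two made-up hyperparameters.
    opts = {"k": list(range(1, 21)), "p": [i / 10 for i in range(11)]}
    loss = lambda c: abs(c["k"] - 7) / 20 + abs(c["p"] - 0.3)
    print(dodge(loss, opts))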
Software repositories contain knowledge about how software engineering teams work, communicate, and collaborate. This knowledge can be used to develop a data-informed view of a team's development process, which in turn can be employed for process improvement initiatives. In modern, Agile development methods, process improvement takes place in Retrospective meetings, in which the last development iteration is discussed. However, previously proposed activities for these meetings often do not rely on project data, instead depending solely on the perceptions of team members. We propose new Retrospective activities, based on mining the software repositories of individual teams, to complement existing approaches with more objective, data-informed process views.
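As one illustration of what mining a team's repository for a Retrospective might involve, the sketch below summarises commit activity per two-week iteration from a local git log; the repository path and iteration length are assumptions, not details from the paper.

    # Sketch: summarise commits per two-week iteration from a local git clone,
    # as raw material for a data-informed Retrospective. The repository path
    # and iteration length are illustrative assumptions.
    import subprocess
    from collections import Counter
    from datetime import datetime, timezone

    REPO = "/path/to/team/repo"          # hypothetical local clone
    ITERATION_DAYS = 14

    log = subprocess.run(
        ["git", "-C", REPO, "log", "--pretty=%at %an"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    commits = []
    for line in log:
        ts, author = line.split(" ", 1)
        commits.append((datetime.fromtimestamp(int(ts), tz=timezone.utc), author))

    start = min(t for t, _ in commits)
    per_iteration = Counter((t - start).days // ITERATION_DAYS for t, _ in commits)
    for iteration in sorted(per_iteration):
        print(f"iteration {iteration}: {per_iteration[iteration]} commits")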
Software process improvement (SPI) is a means to an end, not an end in itself (e.g., the goal is to achieve shorter time to market, not merely compliance with a process standard). Therefore, SPI initiatives ought to be streamlined to deliver the values desired by an organization. Through a literature review, we identified seven secondary studies aggregating maturity models and assessment frameworks. Furthermore, we identified six proposals for building a new maturity model. We analyzed the existing maturity models for (a) their purpose, structure, and guidelines, and (b) the degree to which they explicitly consider values and benefits. Based on this analysis, and utilizing the guidelines from the proposals for building maturity models, we introduce an approach for developing value-driven SPI. The proposal leverages benefits-dependency networks. We argue that our approach enables the following key benefits: (a) as a value-driven approach, it streamlines value delivery and helps to avoid unnecessary process interventions; (b) as a knowledge repository, it helps to codify lessons learned, i.e., whether adopted practices lead to value realization; and (c) as an internal process maturity assessment tool, it tracks the progress of process realization, which is necessary to monitor progress towards the intended values.
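A benefits-dependency network, as used above, links enabling practices and process changes to the benefits and organizational values they are expected to realize. A minimal sketch of such a network as a directed graph follows; all node names are illustrative examples, not content from the paper.

    # Illustrative benefits-dependency network as a directed graph:
    # practice -> process change -> benefit -> organizational value.
    # All node names are made-up examples.
    bdn = {
        "automated regression tests":  ["shorter stabilization phase"],
        "continuous integration":      ["shorter stabilization phase"],
        "shorter stabilization phase": ["faster releases"],
        "faster releases":             ["shorter time to market"],
    }

    def values_enabled_by(practice, graph):
        """Follow dependency edges to find the downstream values a practice supports."""
        reached, frontier = set(), [practice]
        while frontier:
            node = frontier.pop()
            for nxt in graph.get(node, []):
                if nxt not in reached:
                    reached.add(nxt)
                    frontier.append(nxt)
        # nodes with no outgoing edges are the terminal values
        return [n for n in reached if not graph.get(n)]

    print(values_enabled_by("automated regression tests", bdn))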