The HANDE quantum Monte Carlo project offers accessible stochastic algorithms for general use by scientists in the field of quantum chemistry. HANDE is an ambitious and general high-performance code developed by a geographically dispersed team with a variety of backgrounds in computational science. In the course of preparing a public, open-source release, we have taken this opportunity to step back and look at what we have done and what we hope to do in the future. We pay particular attention to development processes, the approach taken to train students joining the project, and how a flat hierarchical structure aids communication.
The development of scientific software is, more than ever, critical to the practice of science, and this is accompanied by a trend towards more open and collaborative efforts. Unfortunately, there has been little investigation into who is driving the evolution of such scientific software or how the collaboration happens. In this paper, we address this problem. We present an extensive analysis of seven open-source scientific software projects in order to develop an empirically informed model of the development process. This analysis was complemented by a survey of 72 scientific software developers. In the majority of the projects, we found senior research staff (e.g. professors) to be responsible for half or more of commits (an average commit share of 72%) and heavily involved in architectural concerns (seniors were more likely to interact with files related to the build system, project meta-data, and developer documentation). Juniors (e.g. graduate students) also contribute substantially: in one studied project, juniors made almost 100% of its commits. Still, graduate students had the longest contribution periods among juniors (with 1.72 years of commit activity compared to 0.98 years for postdocs and 4 months for undergraduates). Moreover, we also found that third-party contributors are scarce, contributing to the project for just one day. The results from this study aim to help scientists better understand their own projects, communities, and the contributors' behavior, while paving the road for future software engineering research.
A novel modeling framework is proposed for dynamic scheduling of projects and workforce assignment in open source software development (OSSD). The goal is to help project managers in OSSD distribute workforce to multiple projects to achieve high efficiency in software development (e.g. high workforce utilization and short development time) while ensuring the quality of deliverables (e.g. code modularity and software security). The proposed framework consists of two models: 1) a system dynamic model coupled with a meta-heuristic to obtain an optimal schedule of software development projects considering their attributes (e.g. priority, effort, duration) and 2) an agent-based model to represent the development community as a social network, where development managers form an optimal team for each project and balance the workload among multiple scheduled projects based on the optimal schedule obtained from the system dynamic model. To illustrate the proposed framework, a software enhancement request process in the Kuali Foundation is used as a case study. Survey data collected from the Kuali development managers and project managers, together with actual historical enhancement requests, have been used to construct the proposed models. Extensive experiments are conducted to demonstrate the impact of varying parameters on the considered efficiency and quality.
Managing and growing a successful cyberinfrastructure such as nanoHUB.org presents a variety of opportunities and challenges, particularly in regard to software. This position paper details a number of those issues and how we have approached them.
Open source development, to a great extent, is a type of social movement in which shared ideologies play critical roles. For participants of open source development, ideology determines how they make sense of things, shapes their thoughts, actions, and interactions, enables rich social dynamics in their projects and communities, and thereby realizes profound impacts at both individual and organizational levels. While software engineering researchers have been increasingly recognizing ideology's importance in open source development, the notion of ideology has shown significant ambiguity and vagueness, and resulted in theoretical and empirical confusion. In this article, we first examine the historical development of ideology's conceptualization, and its theories in multiple disciplines. Then, we review the extant software engineering literature related to ideology. We further argue the imperatives of developing an empirical theory of ideology in open source development, and propose a research agenda for developing such a theory. How such a theory could be applied is also discussed.
Building on the success of quantum Monte Carlo techniques such as diffusion Monte Carlo, alternative stochastic approaches to solve electronic structure problems have emerged over the last decade. The full configuration interaction quantum Monte Carlo (FCIQMC) method allows one to systematically approach the exact solution of such problems, for cases where very high accuracy is desired. The introduction of FCIQMC has subsequently led to the development of coupled cluster Monte Carlo (CCMC) and density matrix quantum Monte Carlo (DMQMC), allowing stochastic sampling of the coupled cluster wave function and the exact thermal density matrix, respectively. In this article we describe the HANDE-QMC code, an open-source implementation of FCIQMC, CCMC and DMQMC, including initiator and semi-stochastic adaptations. We describe our code and demonstrate its use on three example systems: a molecule (nitric oxide), a model solid (the uniform electron gas), and a real solid (diamond). An illustrative tutorial is also included.