ﻻ يوجد ملخص باللغة العربية
Background: Meeting the growing industry demand for Data Science requires cross-disciplinary teams that can translate machine learning research into production-ready code. Software engineering teams value adherence to coding standards as an indication of code readability, maintainability, and developer expertise. However, there are no large-scale empirical studies of coding standards focused specifically on Data Science projects. Aims: This study investigates the extent to which Data Science projects follow code standards. In particular, which standards are followed, which are ignored, and how does this differ to traditional software projects? Method: We compare a corpus of 1048 Open-Source Data Science projects to a reference group of 1099 non-Data Science projects with a similar level of quality and maturity. Results: Data Science projects suffer from a significantly higher rate of functions that use an excessive numbers of parameters and local variables. Data Science projects also follow different variable naming conventions to non-Data Science projects. Conclusions: The differences indicate that Data Science codebases are distinct from traditional software codebases and do not follow traditional software engineering conventions. Our conjecture is that this may be because traditional software engineering conventions are inappropriate in the context of Data Science projects.
We want to analyze visually, to what extend team members and external developers contribute to open-source projects. This gives a high-level impression about collaboration in that projects. We achieve this by recording provenance of the development p
Quantum computing (QC) is an emerging computing paradigm with potential to revolutionize the field of computing. QC is a field that is quickly developing globally and has high barriers of entry. In this paper we explore both successful contributors t
The development of scientific software is, more than ever, critical to the practice of science, and this is accompanied by a trend towards more open and collaborative efforts. Unfortunately, there has been little investigation into who is driving the
Communication is essential in software engineering. Especially in distributed open-source teams, communication needs to be supported by channels including mailing lists, forums, issue trackers, and chat systems. Yet, we do not have a clear understand
Software testing is one of the very important Quality Assurance (QA) components. A lot of researchers deal with the testing process in terms of tester motivation and how tests should or should not be written. However, it is not known from the recomme