Which contributions count? Analysis of attribution in open source

Added by Jean-Gabriel Young

Publication date 2021

Fields: Informatics Engineering

Research language: English

Authors Jean-Gabriel Young - Amanda Casari - Katie McLaughlin

Software Engineering Computers and Society

Open source software projects usually acknowledge contributions with text files, websites, and other idiosyncratic methods. These data sources are hard to mine, which is why contributorship is most frequently measured through changes to repositories, such as commits, pushes, or patches. Recently, some open source projects have taken to recording contributor actions with standardized systems; this opens up a unique opportunity to understand how community-generated notions of contributorship map onto codebases as the measure of contribution. Here, we characterize contributor acknowledgment models in open source by analyzing thousands of projects that use a model called All Contributors to acknowledge diverse contributions like outreach, finance, infrastructure, and community management. We analyze the life cycle of projects through this model's lens and contrast its representation of contributorship with the picture given by other methods of acknowledgment, including GitHub's top committers indicator and contributions derived from actions taken on the platform. We find that community-generated systems of contribution acknowledgment make work like idea generation or bug finding more visible, which generates a more extensive picture of collaboration. Further, we find that models requiring explicit attribution lead to more clearly defined boundaries around what is and what is not a contribution.

Visualization of Contributions to Open-Source Projects

74 - Andreas Schreiber 2020

We want to analyze visually to what extent team members and external developers contribute to open-source projects. This gives a high-level impression of collaboration in those projects. We achieve this by recording the provenance of the development process and applying graph drawing to the resulting provenance graph. Our graph drawings show which developers jointly changed the same files, and to what extent, which we demonstrate on Germany's COVID-19 exposure notification app, Corona-Warn-App.

Software Engineering Information Retrieval

Ideology in Open Source Development

116 - Yang Yue , Xiaoran Yu , Xinyi You 2021

Open source development, to a great extent, is a type of social movement in which shared ideologies play critical roles. For participants of open source development, ideology determines how they make sense of things, shapes their thoughts, actions, and interactions, enables rich social dynamics in their projects and communities, and thereby realizes profound impacts at both individual and organizational levels. While software engineering researchers have been increasingly recognizing ideology's importance in open source development, the notion of ideology has shown significant ambiguity and vagueness, which has resulted in theoretical and empirical confusion. In this article, we first examine the historical development of ideology's conceptualization and its theories in multiple disciplines. Then, we review the extant software engineering literature related to ideology. We further argue for the imperative of developing an empirical theory of ideology in open source development, and propose a research agenda for developing such a theory. How such a theory could be applied is also discussed.

Software Engineering

Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering

62 - Egor Bogomolov (JetBrains Research, Higher School of Economics, Saint Petersburg) 2020

Authorship attribution (i.e., determining who is the author of a piece of source code) is an established research topic. State-of-the-art results for the authorship attribution problem look promising for the software engineering field, where they could be applied to detect plagiarized code and prevent legal issues. With this article, we first introduce a new language-agnostic approach to authorship attribution of source code. Then, we discuss limitations of existing synthetic datasets for authorship attribution, and propose a data collection approach that delivers datasets that better reflect aspects important for potential practical use in software engineering. Finally, we demonstrate that high accuracy of authorship attribution models on existing datasets drastically drops when they are evaluated on more realistic data. We outline next steps for the design and evaluation of authorship attribution models that could bring the research efforts closer to practical use for software engineering.

Software Engineering

A large-scale comparative analysis of Coding Standard conformance in Open-Source Data Science projects

118 - Andrew J. Simmons , Scott Barnett , Jessica Rivera-Villicana 2020

Background: Meeting the growing industry demand for Data Science requires cross-disciplinary teams that can translate machine learning research into production-ready code. Software engineering teams value adherence to coding standards as an indication of code readability, maintainability, and developer expertise. However, there are no large-scale empirical studies of coding standards focused specifically on Data Science projects. Aims: This study investigates the extent to which Data Science projects follow coding standards. In particular, which standards are followed, which are ignored, and how does this differ from traditional software projects? Method: We compare a corpus of 1048 Open-Source Data Science projects to a reference group of 1099 non-Data Science projects with a similar level of quality and maturity. Results: Data Science projects suffer from a significantly higher rate of functions that use an excessive number of parameters and local variables. Data Science projects also follow different variable naming conventions from non-Data Science projects. Conclusions: The differences indicate that Data Science codebases are distinct from traditional software codebases and do not follow traditional software engineering conventions. Our conjecture is that this may be because traditional software engineering conventions are inappropriate in the context of Data Science projects.

Software Engineering

Making Quantum Computing Open: Lessons from Open-Source Projects

76 - Ruslan Shaydulin , Caleb Thomas , Paige Rodeghero 2019

Quantum computing (QC) is an emerging computing paradigm with the potential to revolutionize the field of computing. QC is a field that is developing quickly around the world and has high barriers to entry. In this paper we explore both successful contributors to the field and the wider QC community, with the goal of understanding the backgrounds and training that helped them succeed. We gather data on 148 contributors to open-source quantum computing projects hosted on GitHub and survey 46 members of the QC community. Our findings show that QC practitioners and enthusiasts have diverse backgrounds, with most of them holding a PhD and trained in physics or computer science. We observe a lack of educational resources on quantum computing. Our goal for these findings is to start a conversation about how best to prepare the next generation of QC researchers and practitioners.

Software Engineering
