
Summary Analysis of the 2017 GitHub Open Source Survey

Added by R. Stuart Geiger
Publication date: 2017
Language: English





This report is a high-level summary analysis of the 2017 GitHub Open Source Survey dataset, presenting frequency counts, proportions, and frequency or proportion bar plots for every question asked in the survey.
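As a rough illustration of the kind of per-question summary described above, the following minimal sketch computes frequency counts and proportions for one question and prints a text bar for each answer. The function name `frequency_table` and the sample responses are hypothetical and are not taken from the report or the survey dataset.

```python
from collections import Counter


def frequency_table(responses):
    """Return {answer: (count, proportion)} for one survey question.

    Missing answers (None) are skipped, so proportions are relative
    to the number of people who answered. This is an assumption of
    the sketch, not necessarily the report's convention.
    """
    answered = [r for r in responses if r is not None]
    counts = Counter(answered)
    total = len(answered)
    # most_common() orders answers from most to least frequent.
    return {answer: (n, n / total) for answer, n in counts.most_common()}


# Hypothetical responses to a single question (not real survey data).
sample = ["Yes", "No", "Yes", None, "Yes", "No"]
table = frequency_table(sample)

# Print count, proportion, and a simple text bar for each answer.
for answer, (n, share) in table.items():
    print(f"{answer:<5} {n:>3} {share:6.1%} {'#' * round(share * 20)}")
```

Applying the same function to each question column in turn would give one table per question, which could then be passed to a plotting library to draw the bar plots.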




GitHub has become the central online platform for much of open source, hosting most open source code repositories. With this popularity, the public digital traces of GitHub are now a valuable means to study teamwork and collaboration. In many ways, however, GitHub is a convenience sample. We need to assess its representativeness, particularly how GitHub's design may alter the working patterns of its users. Here we develop a novel, extensive sample of public open source project repositories outside of centralized platforms like GitHub. We characterize these projects along a number of dimensions and compare them to a time-matched sample of corresponding GitHub projects. Compared to GitHub projects, these projects tend to have more collaborators, are maintained for longer periods, and tend to be more focused on academic and scientific problems.
Computational research and data analytics increasingly rely on complex ecosystems of open source software (OSS) libraries: curated collections of reusable code that programmers import to perform a specific task. Software documentation for these libraries is crucial in helping programmers and analysts know what libraries are available and how to use them. Yet documentation for open source software libraries is widely considered low-quality. This article is a collaboration between CSCW researchers and contributors to data analytics OSS libraries, based on ethnographic fieldwork and qualitative interviews. We examine the formats, practices, and challenges of documentation in these largely volunteer-based projects. There are many different kinds and formats of documentation that exist around such libraries, which play a variety of educational, promotional, and organizational roles. The work behind documentation is similarly multifaceted, including writing, reviewing, maintaining, and organizing documentation. Different aspects of documentation work require contributors to have different sets of skills and overcome various social and technical barriers. Finally, most of our interviewees do not report high levels of intrinsic enjoyment for doing documentation work (compared to writing code). Their motivation is affected by personal and project-specific factors, such as the perceived level of credit for doing documentation work versus more technical tasks like adding new features or fixing bugs. In studying documentation work for data analytics OSS libraries, we gain a new window into the changing practices of data-intensive research, and we help practitioners better understand how to support this often invisible and infrastructural work in their projects.
Desktop GIS applications, such as ArcGIS and QGIS, provide tools essential for conducting suitability analysis, an activity that is central in formulating a land-use plan. But when it comes to building complicated land-use suitability models, these applications have several limitations, including operating system dependence, lack of dedicated modules, insufficient reproducibility, and difficult, if not impossible, deployment on a computing cluster. To address these challenges, this paper introduces PyLUSAT: Python for Land Use Suitability Analysis Tools. PyLUSAT is an open-source software package that provides a series of tools (functions) to conduct various tasks in a suitability modeling workflow. These tools were evaluated against comparable tools in ArcMap 10.4 with respect to both accuracy and computational efficiency. Results showed that PyLUSAT functions were two to ten times more efficient depending on the job's complexity, while generating outputs with similar accuracy compared to the ArcMap tools. PyLUSAT also features extensibility and cross-platform compatibility. It has been used to develop fourteen QGIS Processing Algorithms and implemented on a high-performance computational cluster (HiPerGator at the University of Florida) to expedite the process of suitability analysis. All these properties make PyLUSAT a competitive alternative solution for urban planners and researchers to customize and automate suitability analysis as well as integrate the technique into a larger analytical framework.
Standardisation is an important component in the maturation of any field of technology. It contributes to the formation of a recognisable identity and enables interactions with a wider community. This article reviews past and current standardisation initiatives in the field of Open Source Hardware (OSH). While early initiatives focused on aspects such as licencing, intellectual property and documentation formats, recent efforts extend to ways for users to exercise their rights under open licences and to keep OSH projects discoverable and accessible online. We specifically introduce two standards that are currently being released and call for early users and contributors: the DIN SPEC 3105 and the Open Know How Manifest Specification. Finally, we reflect on challenges around standardisation in the community and relevant areas for future development such as an open tool chain, modularity and hardware-specific interface standards.
Software testing is a very important Quality Assurance (QA) component. Many researchers address the testing process in terms of tester motivation and how tests should or should not be written. However, these recommendations do not reveal how tests are actually written in real projects. In this paper, the following was investigated: (i) the denotation of the word test in different natural languages; (ii) whether the number of occurrences of the word test correlates with the number of test cases; and (iii) which testing frameworks are most used. The analysis was performed on 38 GitHub open source repositories thoroughly selected from the set of 4.3M GitHub projects. We analyzed 20,340 test cases in 803 classes manually and 170k classes using an automated approach. The results show that: (i) there exists a weak correlation (r = 0.655) between the number of occurrences of the word test and the number of test cases in a class; (ii) the proposed algorithm using static file analysis correctly detected 97% of test cases; (iii) 15% of the analyzed classes used a main() function, representing regular Java programs that test the production code without using any third-party framework. The identification of such tests is very complex due to implementation diversity. The results may be leveraged to more quickly identify and locate test cases in a repository, to understand practices in customized testing solutions, and to mine tests to improve program comprehension in the future.
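To make the correlation mentioned in the abstract above more concrete, here is a minimal sketch of how a Pearson coefficient between per-class occurrences of the word test and per-class test-case counts could be computed. The helper names `count_word_test` and `pearson_r`, the case-insensitive substring matching, and the sample numbers are all assumptions for illustration; the paper's actual counting procedure is not described here.

```python
import re
from math import sqrt


def count_word_test(source):
    """Count case-insensitive occurrences of 'test' in a source string.

    Substring matching is an assumption of this sketch; the paper may
    count occurrences differently.
    """
    return len(re.findall(r"test", source, flags=re.IGNORECASE))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-class data (not from the paper):
# occurrences of the word "test" and number of test cases in each class.
word_counts = [4, 10, 7, 3, 12]
test_cases = [2, 5, 6, 1, 4]
print(round(pearson_r(word_counts, test_cases), 3))
```

A coefficient near 1 would mean the word count is a good proxy for the number of test cases in a class, while a lower value would suggest that counting the word alone is not enough to locate tests reliably.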
