Developers in data science and other domains frequently use computational notebooks to create exploratory analyses and prototype models. However, they often struggle to incorporate existing software engineering tooling into these notebook-based workflows, leading to fragile development processes. We introduce Assemblé, a new development environment for collaborative data science projects, in which promising code fragments of data science pipelines can be contributed as pull requests to an upstream repository entirely from within JupyterLab, abstracting away low-level version control tool usage. We describe the design and implementation of Assemblé and report on a user study of 23 data scientists.
Open Science, Reproducible Research, and the Findable, Accessible, Interoperable and Reusable (FAIR) data principles are long-term goals for scientific dissemination. However, the implementation of these principles calls for a reinspection of our means of di…
Computational notebooks have emerged as the platform of choice for data science and analytical workflows, enabling rapid iteration and exploration. By keeping intermediate program state in memory and segmenting units of execution into so-called cells…
The notebook validation tool nbval makes it possible to load and execute Python code from a Jupyter notebook file. As it computes the outputs of the cells in the notebook, it compares them with the outputs saved in the notebook file, treating each cell…
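
As a concrete illustration of that workflow (not part of the abstract above): nbval is distributed as a pytest plugin, so a notebook can be re-executed and its stored outputs checked from the test runner. A minimal sketch in Python, assuming nbval and pytest are installed; the notebook path "analysis.ipynb" is hypothetical.

    import pytest

    # Strict mode: re-execute every code cell in the notebook and compare the
    # freshly computed outputs against the outputs stored in the .ipynb file.
    pytest.main(["--nbval", "analysis.ipynb"])

    # Lax mode: only check that cells run without errors; outputs are compared
    # only for cells explicitly marked for output checking.
    pytest.main(["--nbval-lax", "analysis.ipynb"])

The same checks can be run directly from the command line (e.g., pytest --nbval analysis.ipynb), which is how nbval is typically used in continuous integration.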
Short-form digital storytelling has become a popular medium for millions of people to express themselves. Traditionally, this medium primarily uses 2D media such as text (e.g., memes), images (e.g., Instagram), GIFs (e.g., Giphy), and videos (e.g., T…
The rapid advancement of artificial intelligence (AI) is changing our lives in many ways. One application domain is data science. New techniques for automating the creation of AI, known as AutoAI or AutoML, aim to automate the work practices of data s…