
AutoDS: Towards Human-Centered Automation of Data Science

Published by: Dakuo Wang
Publication date: 2021
Research field: Informatics Engineering
Language: English





Data science (DS) projects often follow a lifecycle that consists of laborious tasks for data scientists and domain experts (e.g., data exploration, model training, etc.). Only recently have machine learning (ML) researchers developed promising automation techniques to aid data workers in these tasks. This paper introduces AutoDS, an automated machine learning (AutoML) system that aims to leverage the latest ML automation techniques to support data science projects. Data workers only need to upload their dataset; the system can then automatically suggest ML configurations, preprocess data, select algorithms, and train the model. These suggestions are presented to the user via a web-based graphical user interface and a notebook-based programming user interface. We studied AutoDS with 30 professional data scientists, where one group used AutoDS and the other did not, to complete a data science project. As expected, AutoDS improves productivity; yet surprisingly, we find that the models produced by the AutoDS group have higher quality and fewer errors, but lower human confidence scores. We reflect on the findings by presenting design implications for incorporating automation techniques into human work in the data science lifecycle.




Read also

Building models from data is an integral part of the majority of data science workflows. While data scientists are often forced to spend the majority of the time available for a given project on data cleaning and exploratory analysis, the time available to practitioners to build actual models from data is often rather short due to the time constraints of a given project. AutoML systems are currently rising in popularity, as they can build powerful models without human oversight. In this position paper, we aim to discuss the impact of the rising popularity of such systems and what a user-centered interface for such systems could look like. More importantly, we also want to point out features that are currently missing in those systems and start to explore better usability of such systems from a data scientist's perspective.
The rapid advancement of artificial intelligence (AI) is changing our lives in many ways. One application domain is data science. New techniques in automating the creation of AI, known as AutoAI or AutoML, aim to automate the work practices of data scientists. AutoAI systems are capable of autonomously ingesting and pre-processing data, engineering new features, and creating and scoring models based on target objectives (e.g., accuracy or run-time efficiency). Though AutoAI is not yet widely adopted, we are interested in understanding how it will impact the practice of data science. We conducted interviews with 20 data scientists who work at a large, multinational technology company and practice data science in various business settings. Our goal is to understand their current work practices and how these practices might change with AutoAI. Reactions were mixed: while informants expressed concerns about the trend of automating their jobs, they also strongly felt it was inevitable. Despite these concerns, they remained optimistic about their future job security due to a view that the future of data science work will be a collaboration between humans and AI systems, in which both automation and human expertise are indispensable.
Ziyao Zhou, Chen Chai, Weiru Yin (2021)
The purpose of this paper is to develop a shared control takeover strategy for a smooth and safe control transition from an automated driving system to the human driver, and to demonstrate its positive impacts on drivers' behavior and attitudes. A human-in-the-loop driving simulator experiment was conducted to evaluate the impact of the proposed shared control takeover strategy under different disengagement conditions. Results from thirty-two drivers showed that the shared control takeover strategy could improve safety performance at the aggregated level, especially for non-driving-related disengagements. For more urgent disengagements caused by another vehicle's sudden braking, a shared control strategy enlarges individual differences. The primary reason is that some drivers had higher self-reported mental workloads in response to the shared control takeover strategy. Therefore, shared control between driver and automation can involve driver training to avoid mental overload when developing takeover strategies.
Machine learning (ML) is increasingly being used in image retrieval systems for medical decision making. One application of ML is to retrieve visually similar medical images from past patients (e.g., tissue from biopsies) to reference when making a medical decision with a new patient. However, no algorithm can perfectly capture an expert's ideal notion of similarity for every case: an image that is algorithmically determined to be similar may not be medically relevant to a doctor's specific diagnostic needs. In this paper, we identified the needs of pathologists when searching for similar images retrieved using a deep learning algorithm, and developed tools that empower users to cope with the search algorithm on-the-fly, communicating what types of similarity are most important at different moments in time. In two evaluations with pathologists, we found that these refinement tools increased the diagnostic utility of images found and increased user trust in the algorithm. The tools were preferred over a traditional interface, without a loss in diagnostic accuracy. We also observed that users adopted new strategies when using refinement tools, re-purposing them to test and understand the underlying algorithm and to disambiguate ML errors from their own errors. Taken together, these findings inform future human-ML collaborative systems for expert decision-making.
Efforts to make machine learning more widely accessible have led to a rapid increase in Auto-ML tools that aim to automate the process of training and deploying machine learning. To understand how Auto-ML tools are used in practice today, we performed a qualitative study with participants ranging from novice hobbyists to industry researchers who use Auto-ML tools. We present insights into the benefits and deficiencies of existing tools, as well as the respective roles of the human and automation in ML workflows. Finally, we discuss design implications for the future of Auto-ML tool development. We argue that instead of full automation being the ultimate goal of Auto-ML, designers of these tools should focus on supporting a partnership between the user and the Auto-ML tool. This means that a range of Auto-ML tools will need to be developed to support varying user goals such as simplicity, reproducibility, and reliability.
