Teaching Responsible Data Science: Charting New Pedagogical Territory


Abstract in English

Although numerous ethics courses are available, with many focusing specifically on technology and computer ethics, pedagogical approaches employed in these courses rely exclusively on texts rather than on software development or data analysis. Technical students often consider these courses unimportant and a distraction from the real material. To develop instructional materials and methodologies that are thoughtful and engaging, we must strive for balance: between texts and coding, between critique and solution, and between cutting-edge research and practical applicability. Finding such balance is particularly difficult in the nascent field of responsible data science (RDS), where we are only starting to understand how to bridge the intrinsically different methodologies of engineering and the social sciences. In this paper we recount a recent experience in developing and teaching an RDS course to graduate and advanced undergraduate students in data science. We then dive into an area that is critically important to RDS -- transparency and interpretability of machine-assisted decision-making -- and tie this area to the needs of emerging RDS curricula. Drawing on our own experience and on the literature on pedagogical methods in data science and beyond, we propose the notion of an object-to-interpret-with. We link this notion to nutritional labels -- a family of interpretability tools that are gaining popularity in RDS research and practice. With this work we aim to contribute to the nascent area of RDS education, and to inspire others in the community to come together to develop a deeper theoretical understanding of the pedagogical needs of RDS and to contribute concrete educational materials and methodologies that others can use. All course materials are publicly available at https://dataresponsibly.github.io/courses.
