In many academic settings, medical students begin scientific work during their studies. As at our institution, they often work in interdisciplinary teams with more or less experienced (postgraduate) researchers in pharmaceutical sciences, the natural sciences in general, or biostatistics. All of them should be taught good research practices as an integral part of their education, especially with regard to statistical analysis. This includes reproducibility as a central aspect of modern research. Acknowledging that even educators might be unfamiliar with necessary aspects of a perfectly reproducible workflow, I agreed to give a lecture series on reproducible research (RR) for medical students and postgraduate pharmacists involved in several areas of clinical research. I designed a pilot lecture series to highlight definitions of RR, reasons for RR, potential merits of RR, and ways to work accordingly. In trying to actually reproduce a published analysis, I encountered several practical obstacles. In this article, I focus on this working example to emphasize the manifold facets of RR, to provide possible explanations and solutions, and to argue that harmonized curricula for (quantitative) clinical researchers should include RR principles. I hope these experiences help raise awareness among educators and students. RR working habits are beneficial not only for ourselves and our students, but also for other researchers within an institution, for scientific partners, for the scientific community, and eventually for the public, which profits from research findings.
Co-designing efficient machine-learning-based systems across the whole hardware/software stack to trade off speed, accuracy, energy and costs is becoming extremely complex and time-consuming. Researchers often struggle to evaluate and compare different published works across rapidly evolving software frameworks, heterogeneous hardware platforms, compilers, libraries, algorithms, data sets, models, and environments. We present our community effort to develop an open co-design tournament platform with an online public scoreboard. It will gradually incorporate best research practices while providing a common way for multidisciplinary researchers to optimize and compare the quality vs. efficiency Pareto optimality of various workloads on diverse and complete hardware/software systems. We want to leverage the open-source Collective Knowledge framework and the ACM artifact evaluation methodology to validate and share complete machine learning system implementations in a standardized, portable, and reproducible fashion. We plan to hold regular multi-objective optimization and co-design tournaments for emerging workloads such as deep learning, starting with ASPLOS18 (ACM conference on Architectural Support for Programming Languages and Operating Systems - the premier forum for multidisciplinary systems research spanning computer architecture and hardware, programming languages and compilers, operating systems and networking), to build a public repository of the most efficient machine learning algorithms and systems, which can be easily customized, reused and built upon.
This paper introduces reproducible research and explains its importance, benefits and challenges. It also introduces some important tools for conducting reproducible research in Transportation Research. Moreover, the source code for generating this paper has been designed so that researchers can use it as a template for writing their future journal papers as dynamic and reproducible documents.
Targeted Learning is a subfield of statistics that unifies advances in causal inference, machine learning and statistical theory to help answer scientifically impactful questions with statistical confidence. Targeted Learning is driven by complex problems in data science and has been implemented in a diversity of real-world scenarios: observational studies with missing treatments and outcomes, personalized interventions, longitudinal settings with time-varying treatment regimes, survival analysis, adaptive randomized trials, mediation analysis, and networks of connected subjects. In contrast to the (mis)application of restrictive modeling strategies that dominate the current practice of statistics, Targeted Learning establishes a principled standard for statistical estimation and inference (i.e., confidence intervals and p-values). This multiply robust approach is accompanied by a guiding roadmap and a burgeoning software ecosystem, both of which provide guidance on the construction of estimators optimized to best answer the motivating question. The roadmap of Targeted Learning emphasizes tailoring statistical procedures so as to minimize their assumptions, carefully grounding them only in the scientific knowledge available. The end result is a framework that honestly reflects the uncertainty in both the background knowledge and the available data in order to draw reliable conclusions from statistical analyses - ultimately enhancing the reproducibility and rigor of scientific findings.
In this paper, we take an important step towards black-box machine teaching by considering cross-space machine teaching, where the teacher and the learner use different feature representations and the teacher cannot fully observe the learner's model. In this scenario, we study how the teacher can still teach the learner to achieve a faster convergence rate than traditional passive learning. We propose an active teacher model that can actively query the learner (i.e., make the learner take exams) to estimate the learner's status and provably guide the learner to achieve faster convergence. The sample complexities for both teaching and querying are provided. In the experiments, we compare the proposed active teacher with the omniscient teacher and verify the effectiveness of the active teacher model.
Quantum computing is a growing field at the intersection of physics and computer science. The goal of this article is to highlight a successfully trialled quantum computing course for high school students aged 15 to 18. This course bridges the gap between popular science articles and advanced undergraduate textbooks. Conceptual ideas in the text are reinforced with active learning techniques, such as interactive problem sets and simulation-based labs at various levels. The course is freely available for use and download under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.