Controlled Experimentation in Continuous Experimentation: Knowledge and Challenges

Published by Florian Auer
Publication date: 2021
Research field: Informatics Engineering
Language of the research: English





Context: Continuous experimentation and A/B testing is an established industry practice that has been researched for more than 10 years. Our aim is to synthesize the conducted research. Objective: We wanted to find the core constituents of a framework for continuous experimentation and the solutions that are applied within the field. Finally, we were interested in the reported challenges and benefits of continuous experimentation. Method: We applied forward snowballing on a known set of papers and identified a total of 128 relevant papers. Based on this set of papers we performed two qualitative narrative syntheses and a thematic synthesis to answer the research questions. Results: The framework constituents for continuous experimentation include experimentation processes as well as supportive technical and organizational infrastructure. The solutions found in the literature were synthesized into nine themes, e.g., experiment design, automated experiments, or metric specification. Concerning the challenges of continuous experimentation, the analysis identified cultural, organizational, business, technical, statistical, ethical, and domain-specific challenges. Further, the study concludes that the benefits of experimentation are mostly implicit in the studies. Conclusions: The research on continuous experimentation has yielded a large body of knowledge on experimentation. The synthesis of published research presented within includes recommended infrastructure and experimentation process models, guidelines to mitigate the identified challenges, and the problems that the various published solutions solve.




Read also

Many aspects of Schubert calculus are easily modeled on a computer. This enables large-scale experimentation to investigate subtle and ill-understood phenomena in the Schubert calculus. A well-known web of conjectures and results in the real Schubert calculus has been inspired by this continuing experimentation. A similarly rich story concerning intrinsic structure, or Galois groups, of Schubert problems is also beginning to emerge from experimentation. This showcases new possibilities for the use of computers in mathematical research.
Andrew Moylan, 2007
The software tool GRworkbench is an ongoing project in visual, numerical General Relativity at The Australian National University. This year, GRworkbench has been significantly extended to facilitate numerical experimentation. The numerical differential geometric engine has been rewritten using functional programming techniques, enabling fundamental concepts to be directly represented as variables in the C++ code of GRworkbench. Sophisticated general numerical methods have replaced simpler specialised algorithms. Various tools for numerical experimentation have been implemented, allowing for the simulation of complex physical situations. A recent claim, that the mass of the Milky Way can be measured using a small interferometer located on the surface of the Earth, has been investigated, and found to be an artifact of the approximations employed in the analysis. This difficulty is symptomatic of the limitations of traditional pen-and-paper analysis in General Relativity, which was the motivation behind the original development of GRworkbench. The physical situation pertaining to the claim has been modelled in a numerical experiment in GRworkbench, without the necessity of making any simplifying assumptions, and an accurate estimate of the effect has been obtained.
We describe our framework, deployed at Facebook, that accounts for interference between experimental units through cluster-randomized experiments. We document this system, including the design and estimation procedures, and detail insights we have gained from the many experiments that have used this system at scale. We introduce a cluster-based regression adjustment that substantially improves precision for estimating global treatment effects as well as testing for interference as part of our estimation procedure. With this regression adjustment, we find that imbalanced clusters can better account for interference than balanced clusters without sacrificing accuracy. In addition, we show how logging exposure to a treatment can be used for additional variance reduction. Interference is a widely acknowledged issue with online field experiments, yet there is less evidence from real-world experiments demonstrating interference in online settings. We fill this gap by describing two case studies that capture significant network effects and highlight the value of this experimentation framework.
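As a rough illustration of the cluster-randomized design mentioned in the abstract above, the following sketch assigns whole clusters to a treatment arm and compares cluster-level mean outcomes. It is a minimal sketch under stated assumptions, not the framework the abstract describes: the function names, the `(cluster_id, outcome)` input shape, and the plain difference in cluster means are all hypothetical choices made here, and the regression adjustment and exposure logging from the abstract are not implemented.

```python
import random
from collections import defaultdict
from statistics import mean


def assign_clusters(cluster_ids, treat_prob=0.5, seed=0):
    """Randomize whole clusters: every unit in a cluster shares one arm.

    Returns a dict mapping cluster id -> True (treated) or False (control).
    """
    rng = random.Random(seed)
    return {c: rng.random() < treat_prob for c in sorted(set(cluster_ids))}


def cluster_difference_in_means(units, assignment):
    """Estimate a treatment effect from (cluster_id, outcome) pairs.

    Outcomes are averaged within each cluster first, then the mean of the
    treated cluster averages is compared with the mean of the control ones.
    This is a plain difference in means, not a regression-adjusted estimator.
    """
    by_cluster = defaultdict(list)
    for cluster_id, outcome in units:
        by_cluster[cluster_id].append(outcome)
    treated = [mean(v) for c, v in by_cluster.items() if assignment[c]]
    control = [mean(v) for c, v in by_cluster.items() if not assignment[c]]
    if not treated or not control:
        raise ValueError("need at least one cluster in each arm")
    return mean(treated) - mean(control)
```

Randomizing at the cluster level keeps connected units in the same arm, which is the property such designs rely on when units can influence one another.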
Online experimentation platforms abstract away many of the details of experimental design, ensuring experimenters do not have to worry about sampling, randomisation, subject tracking, data collection, metric definition and interpretation of results. The recent success and rapid adoption of these platforms in the industry might in part be attributed to the ease of use these abstractions provide. Previous authors have pointed out there are common pitfalls to avoid when running controlled experiments on the web and emphasised the need for experts familiar with the entire software stack to be involved in the process. In this paper, we argue that these pitfalls and the need to understand the underlying complexity are not the result of shortcomings specific to existing platforms which might be solved by better platform design. We postulate that they are a direct consequence of what is commonly referred to as the law of leaky abstractions. That is, it is an inherent feature of any software platform that details of its implementation leak to the surface, and that in certain situations, the platform's consumers necessarily need to understand details of underlying systems in order to make proficient use of it. We present several examples of this concept, including examples from literature, and suggest some possible mitigation strategies that can be employed to reduce the impact of abstraction leakage. The conceptual framework put forward in this paper allows us to explicitly categorize experimentation pitfalls in terms of which specific abstraction is leaking, thereby aiding implementers and users of these platforms to better understand and tackle the challenges they face.
When the Stable Unit Treatment Value Assumption (SUTVA) is violated and there is interference among units, there is not a uniquely defined Average Treatment Effect (ATE), and alternative estimands may be of interest, among them average unit-level differences in outcomes under different homogeneous treatment policies. We term this target the Homogeneous Assignment Average Treatment Effect (HAATE). We consider approaches to experimental design with multiple treatment conditions under partial interference and, given the estimand of interest, we show that difference-in-means estimators may perform better than correctly specified regression models in finite samples on root mean squared error (RMSE). With errors correlated at the cluster level, we demonstrate that two-stage randomization procedures with intra-cluster correlation of treatment strictly between zero and one may dominate one-stage randomization designs on the same metric. Simulations demonstrate performance of this approach; an application to online experiments at Facebook is discussed.
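The two-stage randomization idea in the abstract above can be pictured with a small sketch: each cluster first draws a treatment probability, and units are then randomized within the cluster using that probability. This is only an assumed, generic two-stage design for illustration; the function name, the two probability levels `p_high` and `p_low`, and the equal-chance first stage are hypothetical and are not taken from the paper.

```python
import random


def two_stage_assign(clusters, p_high=0.8, p_low=0.2, seed=0):
    """Two-stage assignment for clusters given as {cluster_id: [unit ids]}.

    Stage 1: each cluster draws p_high or p_low with equal chance.
    Stage 2: each unit in the cluster is treated independently with the
    probability its cluster drew.
    Returns a dict mapping unit id -> True (treated) or False (control).
    """
    rng = random.Random(seed)
    assignment = {}
    for cluster_id in sorted(clusters):
        p = p_high if rng.random() < 0.5 else p_low
        for unit in clusters[cluster_id]:
            assignment[unit] = rng.random() < p
    return assignment
```

In this sketch, setting `p_high=1.0` and `p_low=0.0` treats every unit in a cluster alike (pure cluster randomization), while setting both to the same value reduces to unit-level randomization; values in between give treatment assignments that are partially correlated within a cluster.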