A/B Testing Measurement Framework for Recommendation Models Based on Expected Revenue


Abstract in English

We provide a method for determining whether a new recommendation system improves revenue per visit (RPV) relative to the status quo. We do so by decomposing RPV into conversion rate and average order value (AOV). We use the two-part test suggested by Lachenbruch to determine whether the data-generating process under the new system is different. When this test does not give a definitive answer about the change in RPV, we propose two alternative tests of whether RPV has changed. Both tests rely on the assumption that non-zero purchase values follow a log-normal distribution, which we validate empirically using data collected at different points in time from Staples.com. On average, our method needs a smaller sample size than other methods. Furthermore, it does not require any subjective outlier removal. Finally, it characterizes the uncertainty around RPV by providing a confidence interval.
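As a rough illustration of the two ideas named in the abstract, the sketch below combines a conversion-rate statistic and an order-value statistic into a single two-part statistic, and computes a plug-in RPV under the log-normal assumption. The abstract does not say which component tests or estimators the paper uses, so the choices here are assumptions: a pooled two-proportion z-statistic for conversion, a Welch-style z-statistic on the logs of non-zero order values, and the log-normal mean exp(mu + sigma^2 / 2) for AOV. The function names `two_part_test` and `lognormal_rpv` are hypothetical, and each input is assumed to be a list with one revenue value per visit (0 for no purchase).

```python
import math
from statistics import mean, variance


def two_part_test(control, treatment):
    """Return (statistic, p_value) for a two-part comparison of two arms.

    Assumes each arm has both converting and non-converting visits and
    at least two non-zero order values; otherwise a division by zero or
    a StatisticsError is raised.
    """
    pos_c = [v for v in control if v > 0]
    pos_t = [v for v in treatment if v > 0]
    n_c, n_t = len(control), len(treatment)

    # Part 1: pooled two-proportion z-statistic on conversion rate.
    p_c, p_t = len(pos_c) / n_c, len(pos_t) / n_t
    p_pool = (len(pos_c) + len(pos_t)) / (n_c + n_t)
    se_p = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z_conv = (p_t - p_c) / se_p

    # Part 2: Welch-style z-statistic on the logs of non-zero order values
    # (a large-sample approximation, chosen here for simplicity).
    log_c = [math.log(v) for v in pos_c]
    log_t = [math.log(v) for v in pos_t]
    se_m = math.sqrt(variance(log_c) / len(log_c) + variance(log_t) / len(log_t))
    z_value = (mean(log_t) - mean(log_c)) / se_m

    # Sum of two squared z-statistics, referred to a chi-square
    # distribution with 2 degrees of freedom, whose survival function
    # is exp(-x / 2).
    stat = z_conv ** 2 + z_value ** 2
    return stat, math.exp(-stat / 2)


def lognormal_rpv(visits):
    """Plug-in RPV: conversion rate times the log-normal mean of order values."""
    logs = [math.log(v) for v in visits if v > 0]
    conversion = len(logs) / len(visits)
    aov = math.exp(mean(logs) + variance(logs) / 2)
    return conversion * aov
```

A significant two-part statistic indicates only that conversion, order values, or both differ between arms. It does not by itself give the direction of the change in RPV, for example when conversion rises while AOV falls, which is the case the abstract's additional RPV tests are meant to handle.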
