Jan 27, 2020
How does using Google’s criteria for stopping a bandit algo search lead to “the same certainty that version B was 5% better” (which, btw, is not what a hypothesis test tests, and thus not what it would result in)? What are your grounds for equating the two, other than “Google says so” (I’ve studied their “documentation” and I don’t believe they do)?