Guidelines for Experiment Launches and Ramp-ups at Medium
Below are some thoughts on setting up experiments. This is a work in progress, so please feel free to add or comment as appropriate!
For specifics on setting up an experiment and ramping via go/experiments, click here [link redacted].
1) Experimentation is different from a rollout: If we do not know whether something will work, or whether we want to move forward with it, we should always experiment. Specifically, we should hold at max power ramp (explained below) until we have a statistically significant read before rolling out. A rollout is when we ramp straight up to a high percentage (e.g., 95% or 100%); this is not experimentation.
2) We should aim to run all experiments at maximum statistical power unless there is a PR or other concern that calls for running them at lower power. To get to maximum statistical power, divide 100 by the number of variants (including control). E.g., if you have 3 variants + control, that is 4 groups in total, so maximum power ramp would be 100 / 4 = 25% per variant and 25% control.
- Product Science can provide an estimate of how long we should plan to hold experiments at the max power ramp to get a statistically significant (stat sig) read.
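As a quick illustration of the arithmetic in point 2, here is a minimal Python sketch. The function name max_power_ramp is made up for this example and is not part of go/experiments; it assumes the variant count passed in excludes control, which is then added as one extra group.

```python
def max_power_ramp(num_variants: int) -> float:
    """Percent of traffic per group at max power ramp.

    num_variants excludes control; control counts as one extra group.
    (Hypothetical helper for illustration only.)
    """
    if num_variants < 1:
        raise ValueError("need at least one variant besides control")
    groups = num_variants + 1  # variants plus control
    return 100 / groups


print(max_power_ramp(3))  # 3 variants + control: 25.0
print(max_power_ramp(1))  # simple A/B test: 50.0
```

Anything below this per-group percentage means the experiment is running at less than max power and will likely need to be held longer for a stat sig read.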
3) All experiments should run orthogonal to one another: i.e., they should use unique md5 hash IDs so that the members who fall into the variants of one experiment are distributed evenly across the variants of another experiment. This avoids experiment bias. Exceptions to this rule are very rare and should only be made after talking to Product Science.
- Using a different hash for each experiment is what makes the experiments orthogonal.
- Exceptions to running experiments in separate buckets using same hash: