Containing ramp spillover with partitioned ramps & hash based splits

Ramping-up experiments (from limited canary releases) is a popular way of managing risks and bugs in experiments before exposing all your traffic. A typical ramp-up process looks like this:

Launch an experiment to 10% of traffic
Check your guard rail metrics & for any errors from your experiment
If everything looks good, you can usually:
- A) Ramp-up the existing experiment to 100% traffic without re-assigning users
- B) Restart & re-assign users in the experiment at 100% traffic

Choosing option A, and ramping from 10%->100%, can impact your split test results through Simpson's Paradox (see Section #6). Meanwhile option B - restarting the experiment and re-assigning subjects - can cause spillover where users in the Control group are exposed to the intervention in the Treatment group and vice versa. Option B dilutes the results of your experiment and can mute the effect you hope to measure.

Restarting without acknowledging the prior run, means you'll treat all users as part of the same population to draw from:

Spillover dilutes the effect of your treatment group's intervention on users

Users from the initial 10% run will be randomly assigned to your new 100% run, but those users who swapped variants between runs may dilute your results. It's important because the 10% of users who get reassigned skew toward your most loyal, frequent users.

Option: `C` Partitioned ramps

Lukas Vermeer, director of Experimentation at Booking.com advocates a third and perhaps superior way to ramp experiments. Users assigned during the initial 10% run, can be excluded during the ramped-up run. It only costs a small amount of statistical power.

Spillover dilutes the effect of your treatment group's intervention on users

In this case, none of the users in the 10% run will be included within the 100% ramp.

Prerequisites: You roughly understand how hash-based user assignment in split tests work.

1. Before the experiment

Define a custom decisionAdapter inside your shared code - it needs to:

Hash a user ID defined at Mojito.options.userId
Recognise & apply the excludeSampleRate parameter during the test sample rate decision (when test.options.decisionIdx === 1)

Here's what we recommend:

// Fetch or generate a user ID for use in your test bucketing decisions
Mojito.options.userId = Mojito.utils.getMojitoUserId();

// Override the default decision adapter with this custom one
Mojito.options.decisionAdapter = function (test) {
    // Calculate decision from userId MD5 digest
    test.options.digest = test.options.digest || Mojito.utils.md5(Mojito.options.userId + test.options.id);
    var startPos = test.options.decisionIdx * 8,
        endPos = startPos + 8,
        digest = test.options.digest,
        decision = parseInt(digest.substring(startPos, endPos), 16) / 0xFFFFFFFF;

    // Partitioned ramps - By Lukas Vermeer https://lukasvermeer.nl/
    // Exclude users below a test's excludeSampleRate threshold to avoid spillover between ramps
    if (test.options.decisionIdx === 0 && test.options.excludeSampleRate && decision < test.options.excludeSampleRate) {
        return 2;
    }

    return decision;
};

2. Canary release to 10%

When you launch your experiment, you'll need to set it live for:

Set the sample to 10%: sampleRate: 0.1
Disable cookies for your test's state (i.e. Generate a decision from your userId every time the test activates): options.cookieDuration: -1

Your experiment YAML might resemble this:

state: live
id: ex3
name: Homepage button
recipes:
  '0':
    name: Control
  '1':
    name: Treatment
    css: 1.css
trigger: trigger.js

# 1. Canary release parameters: 10% sample rate, disable cookies
sampleRate: 0.1
options:
  cookieDuration: -1

# 2. Ramp-up parameters: 100% sample rate, initial 10% excluded, cookies re-enabled (optional)
#sampleRate: 1
#excludeSampleRate: 0.1

3. 100% ramp-up

When it comes time to ramp up, you'll need to:

Set the sample rate to 100% (effectively 90%): sampleRate: 1
Exclude the first 10% of users: excludeSampleRate: 0.1
(Optional) Re-enable cookies, by removing the options.cookieDuration: -1 parameter

state: live
id: ex3
name: Homepage button
recipes:
  '0':
    name: Control
  '1':
    name: Treatment
    css: 1.css
trigger: trigger.js

# 1. Canary release parameters: 10% sample rate, disable cookies
#sampleRate: 0.1
#options:
#  cookieDuration: -1

# 2. Ramp-up parameters: 100% sample rate, initial 10% excluded, cookies re-enabled (optional)
sampleRate: 1
excludeSampleRate: 0.1

Wrapping up

Those are some options for managing split test ramping. Each have their trade-offs, and in an ideal world, you would launch every experiment to 100% of traffic - and use Real-time data to decide whether to pull them or not.

Short of living in an ideal world, you have Mojito, which provides you with the control you need to manage your assignment around these risks.

Option: C Partitioned ramps​

1. Before the experiment​

2. Canary release to 10%​

3. 100% ramp-up​

Wrapping up​

Option: `C` Partitioned ramps

1. Before the experiment

2. Canary release to 10%

3. 100% ramp-up

Wrapping up