When designing an A/B testing experiment, there are four main considerations:

Choosing the subject
Choosing the population
Size
Duration

In this blog, we want to choose the subject of our experiment, that is, the unit we measure for our control and experiment groups. This unit is called the unit of diversion.

When choosing the subject for our experiment, we need one unit to represent it. The subject can be events or people. Suppose we randomly assign events, say pageview events, to the control and experiment groups. This is often not feasible, because some people refresh and load the page more often than others, and each reload is a new event that may land in a different group. The other thing we could do is use people. People can be identified by their account. But as in the Udacity example, some people use multiple accounts, such as a corporate account and a student account, or keep several accounts to solve quizzes from different perspectives. So account may not be a good idea either. Let's try cookie instead. A cookie is tied to a particular device and browser: we store an identifier in the browser's cookies. But cookie may not be feasible either. As with accounts, people can have multiple browsers or multiple devices, or they may clear their browser cookies, in which case we assign them a new cookie and the system treats them as a different person. Every candidate unit has pros and cons, and none is always ideal or feasible. Whichever unit we choose to assign to groups is called the unit of diversion.
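Whichever unit is chosen, the assignment itself is commonly made deterministic so that the same id always lands in the same group. The following is a minimal sketch of that idea, not something taken from the course: the function name `assign_group`, the experiment label, and the 50/50 split are all hypothetical choices for illustration.

```python
import hashlib


def assign_group(unit_id: str, experiment: str = "exp-1") -> str:
    """Map a unit of diversion (user id, cookie id, ...) to a group.

    Hypothetical helper: the experiment label and the 50/50 split
    are assumptions made for this example.
    """
    # Prefixing the experiment label gives each experiment its own split.
    key = f"{experiment}:{unit_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Turn the hash into a bucket from 0 to 99.
    bucket = int(digest, 16) % 100
    return "experiment" if bucket < 50 else "control"


# The same cookie id always maps to the same group.
print(assign_group("cookie-123"))
print(assign_group("cookie-123"))
```

If the id itself changes (a new cookie after clearing the browser, a second account), the hash changes too, which is exactly the consistency problem described above.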

Screenshot taken from Udacity, A/B Testing, Unit of Diversion Example

The first is user id, the most stable unit. It is personally identifiable, since it uses an account (email or phone number) to identify the user. Next is cookie, which changes if you change browser or device, or if your browser is set to clear cookies every time you browse. Event is the least consistent because it's not personally identifiable. Device id can't be changed, but it's tied to one device only. IP address changes if the location changes.

Screenshot taken from Udacity, A/B Testing, Unit of Diversion Example

Let's take an example. Suppose we have a sequence of events and we check, for each unit of diversion, whether the user could switch between the experiment and control groups. When a user visits the homepage without signing in, they are only identified anonymously. The user id is assigned when they first log in and doesn't change afterwards. The cookie id changes when the user is on a different device. The event changes at every step as the user goes through the sequence. A device id isn't assigned on a desktop device, and it changes when the user switches to a mobile device. As for the IP address, one is assigned at the first event, but we don't know whether it will change over the later events.

There are three main things to consider when choosing a unit of diversion:

Consistency of Diversion
Ethical Considerations of Diversion
Variability

Consistency of Diversion

User id and cookie are the only units that are close to consistent. Besides choosing the unit, you also need to decide what you want to measure in your experiment. For example, you may want to use cookie to test latency changes, whose impact could differ over time. IP-based diversion is the least consistent, even compared to event-based diversion. It is still useful when you want to test the latency of your site across different internet providers. You can then do post-hoc analysis to decide which segments are comparable between your control and experiment groups.

Screenshot taken from Udacity, A/B Testing, Consistency of Diversion Example

We can choose the level of consistency based on whether users will notice the change. Most of the time, a higher level of consistency is more expensive. From least to most consistent, the order is: event, cookie, and user id. When we change the video load time, users probably won't notice, so we can use event as the unit. But beware of whether the change also produces a learning effect; if it does, switch to cookie-based.

When we change the UI of a button, with event-based diversion the button could change whenever the user reloads the page. So event-based can't be used, and we go with cookie-based instead. The UI usually differs between mobile and desktop pages anyway, so cookie-based is enough. Changing the order of search results is also unlikely to be noticed by users, so we can go with event-based. Adding instructor notes to a quiz that has a low pass rate calls for user id, since user id guarantees that once a student finishes the quiz on one device, the change carries over to their other devices.

Ethical Considerations for Diversion

As discussed in the previous blog, the policy on what data we collect and how identifiable it is matters. When we choose user id, we expose users' email addresses and possibly their phone numbers. There are two things to take care of here. The first is data handling: if you already have a good system for protecting this data, this is easy, but if not, you will need to convert all user ids into something unidentifiable. The second is consent: you have to provide a consent form to the participants. Event-based and cookie-based diversion don't identify a particular person, so they don't require it.
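One possible way to make user ids unidentifiable before they reach the experiment logs is keyed hashing. This is only a sketch under assumptions that are not from the course: the helper name `pseudonymize` and the key value are hypothetical, and keyed hashing is pseudonymization rather than full anonymization, so whether it satisfies a given data policy still has to be checked.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Replace a user id (e.g. an email) with a stable keyed hash.

    Hypothetical helper: the same id and key always give the same token,
    so the token can still serve as a unit of diversion.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for illustration only; a real key would be kept secret.
SECRET_KEY = b"replace-with-a-real-secret"

token = pseudonymize("student@example.com", SECRET_KEY)
print(token)  # a 64-character hex string that does not contain the email
```

The token stays consistent for the same user, so group assignment is unaffected, while the raw email or phone number never needs to appear in the analysis data.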

Screenshot taken from Udacity, A/B Testing, Ethical Considerations Example

When the sensitive data has already been collected, as with a newsletter prompt shown after the user starts a course, additional ethical review is unnecessary, because no new data is collected. In the second case, where the prompt appears before the user has registered, it is necessary. Users who haven't registered yet need to be informed that their email will be stored. Finally, changing the course overview doesn't need the user's personal information, so cookie is enough.

Unit of Analysis vs Unit of Diversion

In the previous blog, we talked about variability. It comes in two kinds, analytical variability and empirical variability. Empirical variability can sometimes be a factor of 4 or 5 higher than analytical. Whether this happens depends on your metric's denominator, the unit of analysis. Take CTR for example: it is clicks divided by pageviews, so the unit of analysis is the pageview. If you also use pageview (event-based) as your unit of diversion, then the analytical and empirical variability won't differ much. But if you use cookie or user id as the unit of diversion, the empirical variability will be much higher. The reason is that the analytical estimate assumes the units of analysis are independent. When event is used for both units, each event is assigned at random, so the assumption holds. But when cookie or user id is chosen, the events in a group depend on (correlate with) the users who generated them, and this inflates the variability.
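The effect can be reproduced with a small simulation. Everything in this sketch is an assumption for illustration rather than anything from the course: the function names, the Beta(0.5, 4.5) click propensity per cookie, the 1 to 20 pageviews per cookie, and the 50/50 split. It builds one synthetic set of pageviews, repeatedly re-randomizes it by pageview or by cookie, and compares the empirical standard deviation of the CTR difference with the analytical standard error sqrt(p(1-p)(1/n1 + 1/n2)).

```python
import math
import random
import statistics


def simulate_population(n_cookies, rng):
    """Build a list of (cookie, clicked) pageviews for a synthetic population."""
    pageviews = []
    for cookie in range(n_cookies):
        p_click = rng.betavariate(0.5, 4.5)  # assumed per-cookie click propensity
        for _ in range(rng.randint(1, 20)):  # assumed pageviews per cookie
            pageviews.append((cookie, 1 if rng.random() < p_click else 0))
    return pageviews


def ctr_difference(pageviews, divert_by, rng):
    """Randomly split traffic 50/50 and return CTR(experiment) - CTR(control)."""
    clicks, views, cookie_group = [0, 0], [0, 0], {}
    for cookie, clicked in pageviews:
        if divert_by == "pageview":
            group = rng.randrange(2)  # every pageview is assigned on its own
        else:
            # All pageviews of a cookie share the group drawn for that cookie.
            group = cookie_group.setdefault(cookie, rng.randrange(2))
        clicks[group] += clicked
        views[group] += 1
    return clicks[1] / views[1] - clicks[0] / views[0]


def compare_variability(n_cookies=1000, n_runs=200, seed=0):
    rng = random.Random(seed)
    pageviews = simulate_population(n_cookies, rng)
    n = len(pageviews)
    p = sum(clicked for _, clicked in pageviews) / n
    # Analytical SE of a difference of proportions with about n/2 pageviews per group.
    result = {"analytical": math.sqrt(p * (1 - p) * (2 / n + 2 / n))}
    for unit in ("pageview", "cookie"):
        diffs = [ctr_difference(pageviews, unit, rng) for _ in range(n_runs)]
        result[unit] = statistics.stdev(diffs)  # empirical variability
    return result


print(compare_variability())
```

Under these assumptions the pageview-diverted spread should land close to the analytical value, while the cookie-diverted spread should be clearly larger, because pageviews from the same cookie share a click propensity.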

Screenshot taken from Udacity, A/B Testing, Unit of Analysis vs Diversion Example

Diane and her colleagues at Google once measured the variability of the coverage metric. Its denominator is queries (the unit of analysis), which is event-based. The unit of diversion was tested with both query and cookie. With query as the unit of diversion, the empirical variability is close to what the analytical standard error estimates. This confirms her statement that when the unit of diversion and the unit of analysis are the same, the two estimates roughly agree. But with cookie-based diversion, the variability is much higher than the analytical estimate, which is what you see in the plot above. The two lines show the empirical variability, and the x-axis is the analytical variability.

Screenshot taken from Udacity, A/B Testing, Unit of Analysis vs Diversion Example

Take a look at the example. In the first and second cases, the empirical and analytical variability will diverge, since the unit of diversion and the unit of analysis are different. One user or cookie generates multiple events, and those events end up spread across the control and experiment groups. As a result, an event is not independent of the other events generated by the same user.

Disclaimer:

This blog was originally created as an online personal notebook, and the materials are not mine. Take the material as is. If you like this blog, or you think any of the material is a bit misleading and want to learn more, please visit the original sources at the reference link below.

References:

Diane Tang and Carrie Grimes. Udacity
