Population of Experiment

  |   Source

When choosing population, we left with two kind of user experiment, inter and intra user. Intra user is like event-based experiments, where it could be same user to the same group. Intra user let same user in control and experiment group. The thing to keep in mind is that you don't want to run the experiments before and after big event, like Christmas. This could be vary greatly.

The other alternative is inter-user experiments. This would means that different user in both groups. And we also want to keep something like Cohort, or lurking variables. It's the variable that potentially makes bias if we don't divide equal features for both group. Medical experiment usually more sensitive, randomly assign equal of both groups, variables like demographic, gender, age, etc. A/B testing on internet experiments lack of such things. We don't event know whether the participants is real people.

Target population

When choosing target population, you can choose variety of target features. You can choose language, location, browser, country, or device type.

Choosing target in advance is based on variety of reasons limiting user experience. When you have a high profile features that you want to add, you want to restrict users so it's not potentially leak to blog or news coverage. When you want to launch internationally, you want to test your language whether it's right for every language, like in japanese, chinese or Indonesian for example. You also don't want to risk running the same experiment that the other team in your company already do it. You can target the other section(country) and the other can run on the other side as well. And when choosing one section, like English people for example, keep in mind that you analyse and conclude the experiment based on English people only.

Choosing target population not in advance is based on other variety of reasons. You don't know the subject of the experiment. Do query on images won't exactly tell you which population. You're not sure the target is exact. Or it could be features that globally effect your population like 90%, so you really don't care that much. You also want to have some filtering in targeted and untargeted population in your experiments. It could get mixed up and distracts the results. Also check with engineering team how the feature is affect which population. And before making a big launch, you can run global experiment to make sure it doesn't do weird traffic to the target area you interested.


Screenshot taken from Udacity, A/B Testing, Unit of Analysis vs Diversion Example

In [1]:
0.0013 > 0.0025

On the other hand, New Zealand, is signifcant. We can check this by,

In [2]:
0.012 > 0.0082

So we conclude that using worldwide in experiment is insignificant, while in New Zealand is significant. The reason behind this is because the sample size is much larger (larger than requirement 10% of population), and variability could be not that much. Worldwide experiment could find person that is related to each other, so it distrupt independency. While in New Zealand, the sample is much smaller, hence much higher variability. This proves statement earlier that if you know your experiment will affect that traffic, you might want to target that traffic.

P.S observed difference, represented as $\hat{d}$ is proportion of control substracted by proportion of experiment(higher)

Population vs Cohort

There's also time when we want to choose population or cohort. Cohort is when you lock users to based on geographic region, device, or browser. Sometimes in the mids of experiment, there's user that restart their cookie cache, move that they outside country that you're targetting, or even stop at all not visiting your website anymore. Cohort means entering participants to a class, keep the users who enter the experiment at the same time for both groups. You don't count for any participants that join later, and you still take into account whether cohort users stop or changing device/browser etc.

You use cohort in experiments instead of population when:

  • Looking for learning effects
  • Examining user retention
  • Want to increase user activity
  • Anything requiring user to be established


Screenshot taken from Udacity, A/B Testing, Population vs Cohort Example

Suppose Audacity want to restructure their particular course. When starting an experiment, it doesn't make sense to include users before the experiment begin. They might be already finished the course and doesn't see the new structure of the course. So the experiment should include students that only started the course after the experiment begin. Students included split to both control and experiment groups. In this case cohort limit the population based on time, but it could be other features. Cohort limit your experiment to a subset of population, and like New Zealand example, will result in different variability.


This blog is originally created as an online personal notebook, and the materials are not mine. Take the material as is. If you like this blog and or you think any of the material is a bit misleading and want to learn more, please visit the original sources at reference link below.


  • Diane Tang and Carrie Grimes. Udacity