There are many things to take into account when choosing the size of your experiment. The practical significance level, the statistical significance level, the sensitivity you need, and your choice of metric, cohort, and population all result in different variability, and so in a different required size.
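To make the link between these quantities and the size concrete, here is a minimal sketch using the common normal-approximation formula for comparing two proportions. It is only an illustration, not a method taken from the course: the function name, the default `alpha` and `power`, and the use of `baseline_rate * (1 - baseline_rate)` as the variance for both groups are all assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline_rate, d_min, alpha=0.05, power=0.8):
    """Approximate users needed per group for a two-sided test on a proportion.

    baseline_rate: assumed current value of the metric (e.g. click-through rate)
    d_min:         practical significance level, the smallest absolute change we care about
    alpha:         statistical significance level
    power:         sensitivity, the chance of detecting a true change of size d_min
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Assumption: both groups share the baseline Bernoulli variance.
    variance = baseline_rate * (1 - baseline_rate)
    return ceil(2 * variance * (z_alpha + z_beta) ** 2 / d_min ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: 10% baseline rate, 2 percentage point minimum change.
    print(sample_size_per_group(0.10, 0.02))
```

Halving `d_min` roughly quadruples the required size, which is why the practical significance level matters so much for how long the experiment runs.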
Variability also affects how long your experiment has to run. Suppose you want to run an experiment that will affect users globally. Running the experiment worldwide is time consuming, since you have to observe a lot of users. What you can do instead is take a subset of the population, for example a cohort. This gives you a much smaller size and a different variability, but it still gives you some intuition about whether your experiment actually has an effect.
Take the video latency example from the previous blog post. Suppose what you really care about is the 90th percentile of latency, that is, people with slower internet connections. And because you want immediate feedback, you build a cohort of users whose last activity was within the past 2 months. The result of this smaller experiment can help you decide whether to continue to the worldwide experiment.
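A rough sketch of how such a cohort could be selected is shown below. Everything here is hypothetical: the record fields `last_seen` and `latency_ms`, the 60-day reading of "2 months", and the choice to keep users at or above the 90th percentile of latency among recent users are assumptions made for illustration.

```python
from datetime import date, timedelta
from statistics import quantiles


def slow_recent_cohort(users, today, active_days=60, pct=90):
    """Keep recently active users whose latency is at or above the given percentile.

    users: list of dicts with hypothetical keys "user_id", "last_seen" (date)
           and "latency_ms" (number).
    """
    cutoff = today - timedelta(days=active_days)
    # Step 1: cohort by recency (last activity within the window).
    recent = [u for u in users if u["last_seen"] >= cutoff]
    # Step 2: percentile threshold computed on the recent users only.
    # quantiles(..., n=100) returns 99 cut points; index pct - 1 is the pct-th percentile.
    threshold = quantiles([u["latency_ms"] for u in recent], n=100)[pct - 1]
    return [u for u in recent if u["latency_ms"] >= threshold]


if __name__ == "__main__":
    today = date(2024, 1, 31)
    users = [
        {"user_id": i, "last_seen": today - timedelta(days=10), "latency_ms": 100 * (i + 1)}
        for i in range(10)
    ]
    print([u["user_id"] for u in slow_recent_cohort(users, today)])
```

The resulting cohort is much smaller than the full population, so its variability should be estimated on the cohort itself rather than reused from the global numbers.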
When choosing the population, we are left with two kinds of user experiment: inter-user and intra-user. An intra-user experiment is like an event-based experiment: the same user is exposed to both the control and the experiment condition over time. The thing to keep in mind is that you don't want to run this kind of experiment across a big event like Christmas, because behavior before and after the event can vary greatly.
The alternative is an inter-user experiment, which means there are different users in each group. Here we also have to keep things like cohorts and lurking variables in mind. A lurking variable is a variable that can introduce bias if it is not distributed equally across both groups. Medical experiments are usually more careful about this: participants are randomly assigned so that both groups are balanced on variables like demographics, gender, age, etc. A/B tests on the internet lack this kind of information. We don't even know whether the participants are real people.
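For an inter-user experiment, one common way to randomize users is to hash a user identifier into a bucket, so that each user always lands in the same group. The sketch below is one possible approach, not something the course prescribes; the function name, the experiment label `"video-latency"`, and the 50/50 split are assumptions.

```python
import hashlib


def assign_group(user_id, experiment="video-latency", treatment_share=0.5):
    """Deterministically assign a user to "control" or "experiment".

    Hashing the experiment name together with the user id means the same user
    always gets the same group within an experiment, while different
    experiments split users independently.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # First 8 hex digits -> integer in [0, 2**32), scaled to [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "experiment" if bucket < treatment_share else "control"


if __name__ == "__main__":
    groups = [assign_group(uid) for uid in range(10_000)]
    print(groups.count("experiment") / len(groups))  # should be close to 0.5
```

Random assignment like this only balances lurking variables on average. If you do have attributes such as country or device, it is worth comparing their distribution across the two groups before trusting the result.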