Intro to Inferential Statistics

2015-03-06 17:58 | Source

Statistics is not only used to describe data and understand it better. We can also use observations to draw conclusions that go beyond the data at hand. This branch is called Inferential Statistics, and one of the methods introduced here is Hypothesis Testing. The case used here is a study in which personnel files were given to 48 supervisors. The files were identical except for gender, yet the promotion rate for the male files was 30% higher than for the female files. The files were randomly assigned to the supervisors, as expected of a controlled experiment.

The Null Hypothesis says that nothing is going on: the two variables (gender and promotion) are independent, and the observed difference happened due to chance. The Alternative Hypothesis says the opposite: something is going on, and the two variables are dependent.

[Figure: flow of hypothesis testing]

This is the flow of Hypothesis Testing, using a court trial as the example. We start from the assumption that the Null Hypothesis is true, just as the defendant is presumed innocent. Presented with the evidence, we ask whether the data could plausibly be due to chance alone. If the evidence is not convincing, that does not mean we have proved the defendant innocent; it just means that the evidence is not enough to reject the Null Hypothesis. On the contrary, if the data would be very unlikely under the Null Hypothesis, we reject it in favor of the Alternative Hypothesis. So there are two possible outcomes: we fail to reject the Null Hypothesis, or we reject it.

[Figure: hypothesis testing framework overview]

This is an overview of the hypothesis testing framework. The Null Hypothesis is assumed to be true until the evidence is strong enough to reject it in favor of the alternative. We ask how likely it would be to observe a result at least as extreme as ours if the Null Hypothesis were true. If that probability is low, lower than the cutoff chosen for the study, we reject the Null Hypothesis.
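The decision rule can be written compactly as follows. The symbol \alpha for the cutoff (usually called the significance level) is notation assumed here; the text does not name it.

```latex
\[
p\text{-value} = P(\text{result at least as extreme as the one observed} \mid H_0 \text{ is true}),
\qquad
\text{reject } H_0 \text{ if } p\text{-value} < \alpha .
\]
```

If the p-value is not below the cutoff, we fail to reject H_0, which is not the same as proving that H_0 is true.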

To see what chance alone would produce, we run a simulation with playing cards.

[Figure: dot plot of simulated differences in promotion rate]

In this case, the Null Hypothesis states that there is no difference between men and women in getting promoted, and that the observed difference happened due to chance. Looking at the dot plot above, simulated differences at least as extreme as the 30% observed in the data (an absolute difference of 0.3 or more) are rare. Because such a result is very unlikely under the Null Hypothesis, we reject the Null Hypothesis.

In other words, Hypothesis Testing asks how likely a 30% difference would be if the Null Hypothesis of no difference were true. That probability is very low, and thus we reject the Null Hypothesis.
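The card simulation can be sketched in Python as repeated shuffling of the promotion outcomes. This is only a minimal sketch, not the course's own code. The counts (24 files per gender, 35 promotions in total, 21 of them for male files) are assumptions chosen to match the 48 supervisors and the roughly 30% difference described here, and the function names are made up for illustration. Whether "at least as extreme" is counted in one direction or in absolute value is also an assumption, so the sketch reports both.

```python
import random

# All three counts below are assumptions for illustration.
N_PER_GROUP = 24    # 24 male files and 24 female files (48 in total)
N_PROMOTED = 35     # total promotions across both groups
OBSERVED_MALE = 21  # promotions among male files (21/24 - 14/24 is about 0.29)


def rate_difference(male_promoted):
    """Male promotion rate minus female promotion rate for a given split."""
    female_promoted = N_PROMOTED - male_promoted
    return (male_promoted - female_promoted) / N_PER_GROUP


def simulate_differences(n_sims=10_000, seed=0):
    """Shuffle the outcomes as if gender did not matter and record each difference."""
    rng = random.Random(seed)
    # 1 = promoted, 0 = not promoted; one "card" per file
    cards = [1] * N_PROMOTED + [0] * (2 * N_PER_GROUP - N_PROMOTED)
    diffs = []
    for _ in range(n_sims):
        rng.shuffle(cards)
        # The first half of the shuffled deck plays the role of the male files
        male_promoted = sum(cards[:N_PER_GROUP])
        diffs.append(rate_difference(male_promoted))
    return diffs


observed = rate_difference(OBSERVED_MALE)
diffs = simulate_differences()
eps = 1e-9  # tolerance for floating-point comparison

# Share of simulated differences at least as large as the observed one
one_sided = sum(d >= observed - eps for d in diffs) / len(diffs)
# Share of simulated differences at least as large in absolute value
two_sided = sum(abs(d) >= abs(observed) - eps for d in diffs) / len(diffs)

print(f"observed difference: {observed:.3f}")
print(f"share of simulations >= observed: {one_sided:.4f}")
print(f"share of simulations with |difference| >= observed: {two_sided:.4f}")
```

Each shuffle corresponds to one dot in a dot plot of simulated differences, and the printed shares are simulation-based estimates of the p-value under the Null Hypothesis.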

So the p-value is one of the most common tools for deciding between two competing hypotheses. More on this later.

RESOURCES :

https://class.coursera.org/statistics-003