Is it possible to predict the results of an election? Here is how to do it.
Bootstrapping is a method for estimating bias, variance, confidence intervals and prediction error from a sample of a population. The proportions we observe in the sample then stand in for the unknown proportions of the population.
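As a rough illustration, here is a minimal sketch of bootstrap resampling in Python. The poll numbers (520 of 1,000 voters in favour) and the helper name `bootstrap_se` are made up for this example. The idea is to redraw from the sample itself, with replacement, and see how much the proportion moves around.

```python
import random
import statistics

def bootstrap_se(sample, resamples=2000, seed=0):
    """Estimate the SE of the sample proportion by resampling with replacement."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(sample)
    proportions = []
    for _ in range(resamples):
        resample = rng.choices(sample, k=n)  # draw n values with replacement
        proportions.append(sum(resample) / n)
    # The spread of the resampled proportions approximates the standard error.
    return statistics.stdev(proportions)

# Hypothetical poll: 520 of 1,000 voters favour candidate A (1 = yes, 0 = no).
sample = [1] * 520 + [0] * 480
boot_se = bootstrap_se(sample)
print(f"sample proportion = {sum(sample) / len(sample):.3f}")
print(f"bootstrap SE      = {boot_se:.4f}")
```

The printed SE should land close to the value given by the SD-of-the-box formula for the same sample.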
We often use bootstrapping in medical trials, where we take a small sample of the population and give a treatment to one group and a placebo to the other. The expected value for the entire population and the standard error of our estimate for each of the two groups are then measured to compare the outcome of the disease with and without treatment.
By knowing the expected value (EV) and standard error (SE) of our sample, we can find the confidence interval (CI), and hence determine whether our results were a fluke or whether the difference in our sample is statistically significant.
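- $\mathrm{EV}_{\text{proportion}} = \text{mean}_{\text{box}} = \text{population proportion}$
- $\mathrm{SE}_{\text{proportion}} = \mathrm{SD}_{\text{box}} / \sqrt{\text{sample size}}$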
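The two formulas above can be sketched in a few lines of Python. This assumes a box of 0s and 1s, for which the SD is sqrt(p × (1 − p)), and it plugs in the sample proportion for the unknown population proportion. The poll figures and the function name `proportion_se` are hypothetical.

```python
import math

def proportion_se(p, n):
    """Standard error of a sample proportion drawn from a 0/1 box."""
    sd_box = math.sqrt(p * (1 - p))  # SD of a box with a fraction p of 1s
    return sd_box / math.sqrt(n)

# Hypothetical poll: 1,000 voters, 52% favour candidate A.
p_hat = 0.52  # sample proportion, used as the estimate of the EV
n = 1000      # sample size
se = proportion_se(p_hat, n)
print(f"EV = {p_hat:.3f}, SE = {se:.4f}")
```

With these numbers the SE comes out at roughly 0.016, or about 1.6 percentage points.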
As we have seen before, our measurements follow a normal curve: moving 1 SD either side of the mean covers about 68% of our data, 2 SD covers about 95%, and 3 SD covers about 99.7%.
Therefore, taking our EV of the proportion plus or minus 2 × SE gives a 95% confidence interval. This is the conventional confidence level.
This means that if you take a series of samples from the population and calculate the 95% confidence interval for each one, then about 95 times out of 100 the interval will contain the true value for the population.
Back to our medical trial example. We can use the confidence interval to determine whether the treatment group performed better due to a sampling fluke, or whether this trend would be found in the entire population after treatment.
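To make the two-SE rule concrete, here is a small sketch using the same hypothetical poll (52% of 1,000 voters). The function name `confidence_interval_95` is an illustrative choice, and the SD of the box is again assumed to be that of a 0/1 box.

```python
import math

def confidence_interval_95(p_hat, n):
    """Approximate 95% CI for a proportion: estimate plus or minus 2 SE."""
    se = math.sqrt(p_hat * (1 - p_hat)) / math.sqrt(n)
    return p_hat - 2 * se, p_hat + 2 * se

# Hypothetical poll: 52% of 1,000 voters favour candidate A.
low, high = confidence_interval_95(0.52, 1000)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Here the interval runs from about 0.488 to about 0.552. It straddles 50%, so this poll alone would not settle who wins.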
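One way to check this interpretation is to simulate it. The sketch below assumes a made-up true proportion of 0.52, draws many samples of 1,000, builds a two-SE interval from each, and counts how often the interval contains the true value. The function name `coverage` and all parameters are hypothetical.

```python
import math
import random

def coverage(true_p, n, trials, seed=0):
    """Fraction of simulated 95% intervals that contain the true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a 0/1 box with a fraction true_p of 1s.
        ones = sum(1 for _ in range(n) if rng.random() < true_p)
        p_hat = ones / n
        se = math.sqrt(p_hat * (1 - p_hat)) / math.sqrt(n)
        if p_hat - 2 * se <= true_p <= p_hat + 2 * se:
            hits += 1
    return hits / trials

rate = coverage(true_p=0.52, n=1000, trials=1000)
print(f"intervals containing the true value: {rate:.1%}")
```

The printed rate should sit near 95%, though it will vary a little with the seed and the number of trials.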
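A sketch of that comparison, with invented trial numbers (60 of 100 recovered on treatment, 45 of 100 on placebo). It assumes the two groups are independent, so the SE of the difference is the square root of the sum of the squared SEs. That step is a standard result but is not derived in this article.

```python
import math

def proportion_se(p, n):
    """Standard error of a sample proportion drawn from a 0/1 box."""
    return math.sqrt(p * (1 - p)) / math.sqrt(n)

def difference_ci_95(p1, n1, p2, n2):
    """Approximate 95% CI for p1 - p2, assuming independent groups."""
    diff = p1 - p2
    se_diff = math.sqrt(proportion_se(p1, n1) ** 2 + proportion_se(p2, n2) ** 2)
    return diff - 2 * se_diff, diff + 2 * se_diff

# Hypothetical trial: 60/100 recover with treatment, 45/100 with placebo.
low, high = difference_ci_95(0.60, 100, 0.45, 100)
print(f"95% CI for the difference: {low:.3f} to {high:.3f}")
if low > 0:
    print("Zero is outside the interval: unlikely to be a sampling fluke.")
else:
    print("Zero is inside the interval: the difference could be a fluke.")
```

If the interval excludes zero, as it narrowly does with these numbers, the difference is unlikely to be explained by sampling chance alone.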