In this post and the next, we will work through the poll survey problem in two different ways. First, how many people must be surveyed to detect a signal of a 3.5 percentage point difference?
The steps are:
find the signal (we know that already: 3.5% or 0.035)
find the noise (standard deviation)
compute signal/noise and equate it to 1.96/root(n), where 1.96 is the z-value for 95% confidence
estimate n
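The reason for equating signal/noise to 1.96/root(n) can be written out in a few lines. This is a sketch that assumes the usual 95% confidence level (z = 1.96) and requires the margin of error, 1.96 standard errors, to equal the signal. The symbols s (signal) and σ (standard deviation of a single response) are introduced here only for illustration.

```latex
\begin{align*}
s &= 1.96\,\frac{\sigma}{\sqrt{n}} \\
\frac{s}{\sigma} &= \frac{1.96}{\sqrt{n}} \\
n &= \left(\frac{1.96\,\sigma}{s}\right)^{2}
\end{align*}
```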
Standard deviation
Imagine a survey asking a random potential voter a question about a candidate. The answer is YES or NO; YES carries a value of 1 and NO a value of 0. Let p be the probability of getting a YES, something we don’t know yet. Treating each answer as a Bernoulli trial (a reasonable approximation here), the variance per trial is p x (1-p), so the standard deviation is root(p x (1-p)). For p = 0.5 (equal probabilities for YES and NO), the standard deviation (sd) is 0.5. The sd is 0.49 for 60:40 and 40:60, 0.46 for 70:30 and 30:70, and so on. Therefore, using a standard deviation of 0.5 in the poll won’t be a big crime.
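The standard deviations quoted above can be checked with a few lines of Python. This is a minimal sketch; the function name bernoulli_sd is made up for this illustration, and it uses the square root of p x (1-p) for a single YES/NO answer.

```python
import math


def bernoulli_sd(p: float) -> float:
    """Standard deviation of one YES/NO (1/0) answer when P(YES) = p.

    Hypothetical helper for illustration: sd = sqrt(p * (1 - p)).
    """
    return math.sqrt(p * (1.0 - p))


# Compare an even split with 60:40 and 70:30 splits.
for p in (0.5, 0.6, 0.7):
    print(f"p = {p:.1f}: sd = {bernoulli_sd(p):.2f}")
```

The sd is symmetric in p and 1-p, which is why 60:40 and 40:60 give the same value, and it changes slowly near p = 0.5.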
Samples
signal/noise = 0.035/0.5 = 0.07
n = (1.96/0.07)^2
= 28^2 = 784
or about 1000 people.
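The same arithmetic can be sketched in Python, which makes it easy to try other signals. The function name required_sample_size and its default arguments (sd = 0.5, z = 1.96 for 95% confidence) are assumptions chosen for this illustration, not part of any library.

```python
def required_sample_size(signal: float, sd: float = 0.5, z: float = 1.96) -> float:
    """Solve signal/sd = z/sqrt(n) for n.

    Hypothetical helper: returns the unrounded n; in practice one would
    round up to a whole number of respondents.
    """
    return (z * sd / signal) ** 2


n = required_sample_size(0.035)
print(f"n is about {n:.0f}")
```

For the 3.5 percentage point signal this comes out at about 784. Because n scales with the inverse square of the signal, halving the signal to 1.75 points would roughly quadruple the number of people needed.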
We will address the same problem in the opposite way in the next post.