First of all, I'd like to apologize for the tone of my previous post. I was struggling through too many systems to post online and was really bollixing it up; that frayed my temper. Then, too, people are dying right now because they or others think that an ignorant guess overrides the expertise of public-health professionals. Even so, nobody dies from an ignorance of statistics, and I shouldn't have been so sarcastic.
Many things can throw a poll off. Think of pre-election polls, where we get the actual result later. Smith is running against Jones, and the pollster calls people up to ask whom they are voting for. Some can't be reached; some refuse to answer; some lie; some change their minds between the phone call and the voting booth. None of that involves statistics.
Let's define some terms. There is a population whose members may or may not have some characteristic. We are taking a sample from that population. We are going to see what fraction of the sample has the characteristic, and we will use that as an estimate of the fraction of the population that has that characteristic. In statistics, we always assume that it is a fair sample, that is to say, that every element of the population has an equal chance to be in the sample.
Now, there are two ways that we can do this. We can take one element, record the characteristic, and set it aside, or we can take one element, record the characteristic, and return it to the population. The first method is called sampling without replacement; the second is called sampling with replacement. It is clear that the method without replacement yields more accurate results, since no element can be counted twice. Statisticians do their calculations on the method with replacement. It is easier, and since it is never more accurate than the method without replacement, it provides a minimum accuracy for the method without replacement.
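To put a number on "more accurate," here is a minimal Python sketch comparing the standard error of the sample fraction under the two methods. It assumes the textbook formulas (binomial variance, and the finite-population correction for sampling without replacement); the function names `se_with_replacement` and `se_without_replacement` are my own, not from any library.

```python
from math import sqrt

def se_with_replacement(p, n):
    """Standard error of the sample fraction for n draws with replacement."""
    return sqrt(p * (1 - p) / n)

def se_without_replacement(p, n, population):
    """Same, without replacement: apply the finite-population correction."""
    correction = sqrt((population - n) / (population - 1))
    return se_with_replacement(p, n) * correction

# Hypothetical numbers: 60% of the population has the characteristic, sample of 10.
p, n = 0.6, 10
print(f"with replacement:               {se_with_replacement(p, n):.4f}")
for population in (10, 50, 5_000, 5_000_000_000):
    se = se_without_replacement(p, n, population)
    print(f"without, population {population:>13,}: {se:.4f}")
```

The correction factor is never greater than 1, so the without-replacement error never exceeds the with-replacement error, and it drops to zero when the sample is the whole population.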
Statistics is the branch of applied mathematics which provides you with precise answers to questions other than the questions you have in mind. When you have tested a sample, you want to know how well it represents the population. You want to know what the probabilistic distribution of the values for the population is. What statistics tells you, without further assumptions, is the reverse: given a population with its characteristics, what the distribution of samples of a certain size is. Let's say that you have a bag of three green marbles and two red marbles. You take out ten marbles one at a time, recording the color and replacing each one. You have a 0.121 probability of getting 8 green marbles (counting getting the same marble twice as 2), 0.215 of getting 7 green marbles, 0.251 of getting 6 green marbles, 0.201 of getting 5 green marbles, etc.
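If you want to check figures like these yourself, they are binomial probabilities. The sketch below assumes ten draws with replacement and a 3-in-5 chance of the counted color on each draw, which is the setup that reproduces 0.121, 0.215 and 0.251; `binomial_pmf` is just a name I picked.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent draws."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 3 / 5  # ten draws; 3 of the 5 marbles have the counted color
for k in range(n, -1, -1):
    print(f"{k:2d} of {n}: {binomial_pmf(k, n, p):.3f}")
```

The eleven probabilities add up to 1, and the most likely single outcome (6 of 10) still happens only about a quarter of the time.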
With replacement, the same probabilities hold for a sample size of ten from a bag containing 3,000,000,000 green marbles and 2,000,000,000 red ones. Now, as the numbers show, a sample size of ten is really insufficient. It is the sample size, however, that provides the accuracy. The population size is irrelevant (except for very small populations, which restrict the possible diversity of the population. If you have only two marbles in the bag and the sample has a red marble and a green marble, then the population must break down 50-50.)
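Here is a rough Python sketch of why the population size drops out. With replacement, the chance of 6 green in 10 draws depends only on the 3/5 fraction. Without replacement it follows the hypergeometric formula, which does depend on the bag size, but only noticeably for small bags. The bag sizes in the loop are illustrative choices of mine, and `hypergeom_pmf` is a hand-rolled helper, not a library call.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k green) in n draws WITH replacement."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, n, green, red):
    """P(exactly k green) in n draws WITHOUT replacement."""
    return comb(green, k) * comb(red, n - k) / comb(green + red, n)

n, k = 10, 6
print(f"with replacement, any bag that is 3/5 green: {binomial_pmf(k, n, 0.6):.4f}")
for scale in (10, 1_000, 1_000_000_000):
    green, red = 3 * scale, 2 * scale
    prob = hypergeom_pmf(k, n, green, red)
    print(f"without replacement, {green + red:>13,} marbles: {prob:.4f}")
```

For the five-billion-marble bag the two methods agree to many decimal places; only the 50-marble bag shows a visible difference.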