Suppose a very rare disease is present in a population. You take a sample from the population, say of size 100, and observe no one with the disease. The classical fixed-sample estimate of the prevalence is then \(0/100 = 0\), so the natural inference is that no one in the population has the disease.

But you know that this conclusion is absolutely wrong.
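To see how easily this happens, here is a minimal sketch that computes the chance of finding no diseased person in a sample of 100, assuming each person independently has the disease with probability \(p\). The function name `prob_no_cases` and the prevalence values are illustrative assumptions, not part of the original discussion.

```python
def prob_no_cases(p: float, n: int) -> float:
    """Probability that n independent Bernoulli(p) draws contain no diseased person."""
    return (1.0 - p) ** n


# Hypothetical prevalence values, chosen only for illustration.
for p in (0.01, 0.001, 0.0001):
    print(f"p = {p}: P(no cases in a sample of 100) = {prob_no_cases(p, 100):.4f}")
```

For a prevalence of 0.001, for instance, the sample of 100 comes back empty about 90% of the time, even though the disease is present.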

This is where a sequential process of sample selection comes to our rescue.

Let us first mathematically model the scenario.

Let \(X_1, X_2, \ldots\) be the sample we draw from the population, where \(X_i\) indicates whether the \(i\)th person is suffering from the disease or not.

Naturally, we model the \(X_i\) as iid Ber\((p)\) random variables, where \(p\) is the parameter we want to estimate.

What do we do?

The idea is simple: we keep drawing fresh sample units from the population until we find someone with the disease. Then we stop.

Does this process remind you of something?

Yes, you are right! This process is exactly what a geometric random variable describes.

So, here we are looking at \(Y \sim\) Geom\((p)\), the number of draws up to and including the first diseased person, where \(p\) is the prevalence of the disease in the population.

Let \(N_0\) be the number of the draw at which the first diseased person appears, that is, the observed value of \(Y\). What, then, is the natural estimate of \(p\)?

\(p\) is estimated by \(\frac{1}{N_0}\).
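The stopping rule and the estimate can be sketched as a small simulation. This is only an illustration under the iid Ber\((p)\) model above: the function name `sample_until_first_case`, the safety cap `max_draws`, the seed, and the value `true_p = 0.01` are all assumptions made for the example.

```python
import random
from typing import Optional


def sample_until_first_case(
    p: float, rng: random.Random, max_draws: int = 10_000_000
) -> Optional[int]:
    """Draw Bernoulli(p) units one at a time and return N_0, the draw
    on which the first diseased person appears.

    Returns None if no case shows up within max_draws draws
    (max_draws is a safety cap added for this sketch).
    """
    for n in range(1, max_draws + 1):
        if rng.random() < p:  # this person has the disease
            return n
    return None


rng = random.Random(0)  # fixed seed so the run is reproducible
true_p = 0.01           # hypothetical prevalence, unknown in practice

n0 = sample_until_first_case(true_p, rng)
if n0 is not None:
    print(f"N_0 = {n0}, estimate of p = 1/N_0 = {1 / n0:.4f}")
```

A single run gives a very noisy estimate, since \(N_0\) itself varies a lot from run to run; the sketch only shows the mechanics of the stopping rule.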

But an important question is: does this \(N_0\) exist? In other words, does the process stop after finitely many draws?

If \(p = 0\), the population has no one with the disease, and the process never stops. If \(p > 0\), we need a mathematical proof that \(P(N_0 < \infty) = 1\), that is, that sampling stops with probability 1.
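Here is a short sketch of that proof for \(p > 0\), under the iid Ber\((p)\) model above. The event \(\{N_0 > n\}\) means that the first \(n\) people drawn are all free of the disease, so

\[ P(N_0 > n) = (1-p)^n \longrightarrow 0 \quad \text{as } n \to \infty, \]

and hence

\[ P(N_0 < \infty) = 1 - \lim_{n \to \infty} P(N_0 > n) = 1. \]

Equivalently, the geometric probabilities sum to one: \(\sum_{n \ge 1} (1-p)^{n-1} p = 1\).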

In the next post, we will go on to study the SPRT (Sequential Probability Ratio Test).
