Multivariate hypergeometric distribution:
The multivariate hypergeometric distribution is a probability distribution that describes how many items of each type appear when a sample is drawn without replacement from a finite population containing several types of items. It generalizes the standard hypergeometric distribution, which covers only the case of two types.
Consider the two-type case first. Suppose we have a population of N items, of which K are classified as type A and N-K are type B. If we draw n items from the population without replacement, the probability that X of the items are type A and Y of the items are type B, where X + Y = n, is given by the two-type special case of the multivariate hypergeometric distribution:
Probability = ( K choose X ) * ( N-K choose Y ) / ( N choose n )
Here, “choose” denotes the binomial coefficient, which counts the number of ways to choose a subset of a given size from a set of items. For example, the expression ( K choose X ) represents the number of ways to choose X items from the K items that are type A in the population. With more than two types, the numerator contains one such binomial coefficient per type, and the drawn counts must again add up to n.
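The two-type formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name two_group_probability and the population used in the usage lines (12 items, 4 of type A, 3 draws) are hypothetical choices made for this example. It relies only on math.comb from the standard library (Python 3.8 or later).

```python
from math import comb

def two_group_probability(N, K, n, X):
    """Probability of drawing exactly X type-A items (and n - X type-B items)
    when n items are drawn without replacement from N items, K of them type A."""
    Y = n - X  # the remaining draws must all be type B
    if X < 0 or Y < 0 or X > K or Y > N - K:
        return 0.0  # impossible outcome
    return comb(K, X) * comb(N - K, Y) / comb(N, n)

# Hypothetical population: 12 items, 4 of type A, draw 3.
probs = [two_group_probability(12, 4, 3, x) for x in range(4)]
print(probs)
print(sum(probs))  # probabilities over all possible X add up to 1 (up to rounding)
```

Returning 0 for impossible outcomes, rather than raising an error, keeps the function usable when summing over every candidate value of X.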
As an example, suppose we have a population of 10 items, of which 5 are red and 5 are blue. If we draw 3 items from the population without replacement, the probability that 2 of the items are red and 1 is blue can be calculated as follows:
Probability = ( 5 choose 2 ) * ( 5 choose 1 ) / ( 10 choose 3 )
= ( 10 ) * ( 5 ) / ( 120 )
= 50 / 120
= 5/12
This means that if we draw 3 items from the population without replacement, there is a 5/12 (about 41.7%) probability that we will end up with 2 red items and 1 blue item.
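A quick simulation is one way to sanity-check a calculation like this. The sketch below repeatedly draws 3 items without replacement from 5 red and 5 blue items and compares the observed frequency of "2 red, 1 blue" with the value of the formula. The function name simulate_red_count, the number of trials, and the seed are assumptions made for this illustration.

```python
import random
from math import comb

def simulate_red_count(n_red, n_blue, draws, target_red, trials, seed=0):
    """Estimate the probability of drawing exactly target_red red items."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    population = ["red"] * n_red + ["blue"] * n_blue
    hits = 0
    for _ in range(trials):
        sample = rng.sample(population, draws)  # sampling without replacement
        if sample.count("red") == target_red:
            hits += 1
    return hits / trials

estimate = simulate_red_count(5, 5, 3, 2, trials=100_000)
exact = comb(5, 2) * comb(5, 1) / comb(10, 3)  # the formula from the text
print(estimate, exact)
```

The estimate will not match the exact value digit for digit, but with this many trials the two should agree to roughly two decimal places.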
As another example, suppose we have a population of 20 items, of which 8 are red, 6 are blue, and 6 are green. If we draw 6 items from the population without replacement, the probability that 2 of the items are red, 2 are blue, and 2 are green can be calculated as follows:
Probability = ( 8 choose 2 ) * ( 6 choose 2 ) * ( 6 choose 2 ) / ( 20 choose 6 )
= ( 28 ) * ( 15 ) * ( 15 ) / ( 38760 )
= 6300 / 38760
= 105/646
This means that if we draw 6 items from the population without replacement, there is a 105/646 (about 16.3%) probability that we will end up with 2 red items, 2 blue items, and 2 green items.
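The same pattern works for any number of groups: multiply one binomial coefficient per group and divide by the number of ways to choose the whole sample from the whole population. The following sketch evaluates that product exactly with fractions, so the arithmetic in both worked examples can be checked without rounding. The function name multivariate_hypergeom_pmf and its argument names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from math import comb, prod

def multivariate_hypergeom_pmf(group_sizes, drawn):
    """Exact probability of drawing drawn[i] items from group i, for every i,
    when sum(drawn) items are taken without replacement."""
    if len(group_sizes) != len(drawn):
        raise ValueError("group_sizes and drawn must have the same length")
    if any(x < 0 or x > k for k, x in zip(group_sizes, drawn)):
        return Fraction(0)  # cannot draw more items than a group contains
    numerator = prod(comb(k, x) for k, x in zip(group_sizes, drawn))
    denominator = comb(sum(group_sizes), sum(drawn))
    return Fraction(numerator, denominator)

# Red/blue example: 5 red, 5 blue, draw 2 red and 1 blue.
print(multivariate_hypergeom_pmf([5, 5], [2, 1]))

# Red/blue/green example: 8 red, 6 blue, 6 green, draw 2 of each.
print(multivariate_hypergeom_pmf([8, 6, 6], [2, 2, 2]))
```

Using Fraction instead of floating-point division returns each result already reduced to lowest terms, which makes it easy to compare with a hand calculation.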
The multivariate hypergeometric distribution is useful whenever we need the probability of drawing a specific combination of items from a finite population without replacement. Because it handles any number of groups, it covers cases that the standard two-group hypergeometric distribution cannot.