
Introduction

Introduction to Urn Problems

In this learnr tutorial, we will use urn problems to illustrate different probability distributions. The general idea of an urn problem is that you imagine an urn filled with balls of different colours (here only two colours), from which you draw a certain number of balls. Drawing one of the colours, here blue, counts as a success. The drawing process can be restricted by certain rules, such as whether you sample with or without replacement, and when you stop drawing balls. Depending on the “rules” you apply to this general problem, you get different probability distributions. In this tutorial, we will briefly introduce you to the binomial, hypergeometric, geometric and negative hypergeometric distributions.

The Binomial Distribution

To get a binomial distribution, we count the number of blue balls (successes) in a fixed number of draws with replacement. The probability of drawing exactly k blue balls is given by:

\[P(k\,blue\,balls)\,=\,\binom{n}{k}\cdot p^{k}\cdot (1-p)^{n-k}\]

Where n = number of balls drawn (sample size), k = number of blue balls drawn (sample successes), and p = probability of drawing a blue ball on any single draw. The content of the urn, i.e. how many red and blue balls there are in total, determines p. Because each ball is put back after it is drawn, p stays the same from draw to draw.

A good example of a binomial distribution is flipping a coin n times and examining the probability of getting a certain number of heads (k). If you assume that your coin has a probability of 0.5 of landing heads, and you flip the coin 4 times, what is the probability of getting 3 heads?
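The coin question can be checked numerically. The following is a minimal Python sketch that uses only the standard library; the function name `binomial_pmf` is a hypothetical helper chosen for this illustration, not part of the tutorial.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n draws with replacement.

    p is the probability of a success (a blue ball, or heads) on one draw.
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Coin example from the text: 4 flips, 3 heads, p = 0.5.
prob_three_heads = binomial_pmf(k=3, n=4, p=0.5)
print(prob_three_heads)  # 0.25
```

There are 4 ways to place the single tail among the 4 flips, and each sequence of 4 flips has probability 0.5^4 = 1/16, which gives 4/16 = 0.25.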

The Hypergeometric Distribution

To get a hypergeometric distribution, you draw n balls out of the urn without replacement and count the number of blue balls among them. The probability of drawing exactly k blue balls is given by:

\[P(k\,blue\,balls)\,=\,\frac{\binom{K}{k}\cdot\binom{N-K}{n-k}}{\binom{N}{n}}\]

With N = total number of balls (population size), K = total number of blue balls (successes in the population), n = number of balls drawn (sample size), and k = number of blue balls drawn (sample successes).

For example, imagine that you have a (slightly strange) sock drawer which contains 20 socks, of which 8 are blue and 12 are red. This gives you:

N = 20, K = 8

Randomly select 10 socks without replacement (n = 10) and count the number of blue socks you have. What is the probability of selecting 4 blue socks, i.e. k = 4?
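The sock drawer question can be answered by plugging the numbers into the hypergeometric formula. Below is a minimal Python sketch using only the standard library; `hypergeometric_pmf` is a hypothetical helper name introduced for this illustration.

```python
from math import comb


def hypergeometric_pmf(k: int, n: int, K: int, N: int) -> float:
    """Probability of exactly k blue items in n draws without replacement.

    N is the population size and K the number of blue items in it.
    """
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Sock drawer from the text: N = 20 socks, K = 8 blue, n = 10 drawn, k = 4 blue.
prob_four_blue = hypergeometric_pmf(k=4, n=10, K=8, N=20)
print(round(prob_four_blue, 4))  # 0.3501
```

The numerator counts the ways to pick 4 of the 8 blue socks and 6 of the 12 red socks (70 · 924 = 64680), and the denominator counts all ways to pick 10 of the 20 socks (184756).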


The Geometric Distribution

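To arrive at a geometric distribution, draw balls with replacement until the first blue ball (success) is drawn. With p = probability of drawing a blue ball on any single draw, and n = number of the draw on which the first blue ball appears, the formula is given by:

\[P(first\,blue\,ball\,on\,draw\,n)\,=\,p\cdot (1-p)^{n-1}\]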
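As an illustration of the formula, here is a minimal Python sketch. The numbers are an assumption made for this sketch and do not come from the tutorial: it reuses the fair coin from the binomial section (p = 0.5) and asks for the probability that the first head appears on the third flip. The name `geometric_pmf` is a hypothetical helper.

```python
def geometric_pmf(n: int, p: float) -> float:
    """Probability that the first success occurs on draw n (n = 1, 2, ...).

    Draws are made with replacement, so p is the same on every draw.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # n - 1 failures in a row, followed by one success.
    return p * (1 - p) ** (n - 1)


# Assumed example: fair coin (p = 0.5), first head on the third flip.
prob_first_head_on_third = geometric_pmf(n=3, p=0.5)
print(prob_first_head_on_third)  # 0.125
```

Two tails followed by one head has probability 0.5 · 0.5 · 0.5 = 0.125.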

EXAMPLE ETC TBD


The Negative Hypergeometric Distribution

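To get a negative hypergeometric distribution, you sample balls without replacement until a fixed number of blue balls has been drawn. For simplicity, we stop here at the first blue ball. With N = total number of balls, K = total number of blue balls, and n = number of the draw on which the first blue ball appears, the formula is given by:

\[P(first\,blue\,ball\,on\,draw\,n)\,=\, \frac{\binom{N-n}{K-1}}{\binom{N}{K}}\]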
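The formula can be tried out with a minimal Python sketch. It covers only the simplified case in which drawing stops at the first blue ball, and it reads n as the number of the draw on which that ball appears. The numbers are an assumption made for this sketch: it reuses the sock drawer from the hypergeometric section (N = 20, K = 8) and asks for the probability that the first blue sock is the third one drawn. The name `first_blue_on_draw_pmf` is a hypothetical helper.

```python
from math import comb


def first_blue_on_draw_pmf(n: int, K: int, N: int) -> float:
    """Probability that the first blue ball appears on draw n.

    Draws are made without replacement from N balls, K of which are blue.
    """
    # After draw n, the other K - 1 blue balls must sit among the N - n later positions.
    return comb(N - n, K - 1) / comb(N, K)


# Assumed example: N = 20 socks, K = 8 blue, first blue sock on draw 3.
prob_first_blue_on_third = first_blue_on_draw_pmf(n=3, K=8, N=20)
print(round(prob_first_blue_on_third, 4))  # 0.1544
```

The same value follows from multiplying the draw-by-draw probabilities: two red socks and then a blue one gives 12/20 · 11/19 · 8/18 ≈ 0.1544.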

EXAMPLE ETC TBD


Explaining Distributions with Urns

Sophia Crüwell and Nadine Koch