# MTH5P2A: The Binomial Distribution

##### This unit explains The Binomial Distribution

Definition

The Binomial Distribution describes the behaviour of a random variable (a count variable) X under the following conditions:

1. the number of trials (n) is fixed
2. each trial has only 2 possible outcomes (success/failure)
3. each trial is independent of the others
4. the probability of success (p) is constant throughout

X (the random variable) counts the number of successes in n trials.

Example

A simple example is choosing 1 ball from a bag of 10 identical balls, each numbered (1-10). Once noted, the ball is returned to the bag.
A single ball is chosen on 3 separate occasions.

Success is obtaining a ‘5’ ball.

So the random variable X has values 0, 1, 2, 3

In other words, from our 3 tries we could have obtained:

0 fives, 1 five, 2 fives, 3 fives

On the first try, the probability of obtaining a 5 is 1/10.
The probability of not getting a 5 is 9/10.

Every time we dip into the bag of 10 balls, the probability of obtaining a ‘5’ is 1/10. The probability is constant.

Getting a ‘5’ or not getting a ‘5’ means that there are only 2 outcomes.

Every try is independent. Previous tries do not affect the result, since previously chosen balls are returned to the bag. Every try is taken from a bag of 10 balls.
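The bag example is small enough to enumerate completely. The sketch below is an illustration only (the names `BALLS`, `TRIES` and `distribution` are not from the original text): it lists every equally likely sequence of 3 draws with replacement and counts how many fives each contains, which gives the probability of each value of X.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

BALLS = range(1, 11)  # ten balls numbered 1 to 10
TRIES = 3             # the ball is returned to the bag after each draw

# Every ordered sequence of 3 draws is equally likely: 10**3 sequences.
counts = Counter(draws.count(5) for draws in product(BALLS, repeat=TRIES))
total = sum(counts.values())

# Probability of each value of X, the number of fives seen in 3 tries.
distribution = {x: Fraction(counts[x], total) for x in range(TRIES + 1)}

for x, prob in distribution.items():
    print(f"P(X={x}) = {float(prob):.3f}")
```

Because the ball is replaced each time, every draw sees the same 10 balls, which is why the 1000 sequences can be treated as equally likely.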

Notation B(n, p)

The full notation describing a Binomial distribution is:

X ~ B(n, p)

where,

X is a random variable (0, 1, 2, 3, …)
~ B means ‘is distributed Binomially’
n is the number of trials
p is the probability of a single trial ‘success’

Example (continued from above)

Say that there are only 3 tries of attempting to take a 5-ball from a bag of 10 balls.

So n = 3 and p = 1/10.

The possible number of 5’s taken in the 3 trials is summarized by the values of the random variable X.

X = 0, 1, 2, 3

Using the Binomial notation,

X ~ B(3, 1/10)

Limits

When trials are drawn from a population without replacement, the population size must be much larger than the sample size.

The distribution applies only to trials from a simple random sample where the population is at least 10 times larger than the sample.

Outside this limit, results do not follow the Binomial equation.

Combinations   nCr

To appreciate the Binomial equation we must first have an understanding of combinations.

The definition of a Combination is: ‘the number of ways r items can be chosen from a set of n items’

A shorthand way of writing this is nCr.

This can be written mathematically as:

$$ {}^{n}C_{r} = \frac{n!}{r!\,(n-r)!} $$

*Note: ! means ‘factorial’, e.g. 3! = 3 x 2 x 1

For large values of n, an easier method of calculation is to use a calculator.

Example

With 52 cards in a deck, how many ways can 3 different cards be chosen? Using a calculator,

52   SHIFT    nCr   3    =    22100
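The calculator result can be checked in code. This is a minimal sketch assuming the usual factorial definition of a combination, n! / (r! (n – r)!); the helper name `n_choose_r` is made up for illustration.

```python
import math

def n_choose_r(n: int, r: int) -> int:
    """Number of ways to choose r items from n, using the factorial formula."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(n_choose_r(52, 3))  # ways to choose 3 cards from 52
print(math.comb(52, 3))   # built-in equivalent (Python 3.8+)
```

Both lines print 22100, in agreement with the calculator.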

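The Binomial Equation

$$ P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r} \qquad \text{(i)} $$

where q is the probability of failure.

Since p + q = 1, q = 1 – p, and the Binomial equation (i) is sometimes expressed as:

$$ P(X = r) = {}^{n}C_{r}\, p^{r} (1-p)^{n-r} $$

Example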

Using the problem first given in Binomial distribution part 1 and extending it to probability prediction:

1 ball is chosen from a bag of 10 identical balls, each numbered (1-10). Once noted, the ball is returned to the bag.
A single ball is chosen on 3 separate occasions.
Success is in obtaining a ‘5‘ ball.

i) What is the probability of obtaining a single ‘5’ ball in 3 choices?

ii) Draw a tree-diagram to find the probability of obtaining a single ‘5’ ball in the last choice.

i) Let X be the random variable for the number of ‘5’ balls obtained.

X has possible values (0, 1, 2, 3).

The number of trials n = 3

The number of required successes r = 1

The probability of a single successful trial (i.e. a ‘5’ ball) p = 1/10, so the probability of failure q = 9/10

Using the Binomial equation,

P(X=1) = 3C1 x (1/10) x (9/10)² = 3 x 81/1000 = 0.243

ii) The probability (P) of getting a single ‘5’ ball in 3 tries is the sum of the probabilities of each route through the tree diagram that contains exactly one ‘5’.

The sum sign ‘+’ signifies an ‘OR’ decision.

The multiplication sign ‘x’ signifies the ‘AND’ condition.

P = (q x q x p) + (q x p x q) + (p x q x q)

So the probability PL of the last ball drawn being a ‘5’ is given by:

PL = (q x q x p) = (9/10) (9/10) (1/10) = 81/1000 = 0.081
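The tree diagram can also be walked in code. The following sketch is illustrative only (the labels "S" and "F" and the function name `binomial_pmf` are assumptions, not part of the original): it multiplies probabilities along each path (AND) and adds the paths that contain exactly one success (OR), then compares the total with the Binomial equation.

```python
from itertools import product
from math import comb

n, p = 3, 0.1
q = 1 - p

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Each path through the tree is a sequence of S (a '5') and F (not a '5').
paths = {}
for path in product("SF", repeat=n):
    prob = 1.0
    for outcome in path:
        prob *= p if outcome == "S" else q  # AND: multiply along the path
    paths["".join(path)] = prob

# OR: add the probabilities of the paths with exactly one success.
single_five = sum(pr for path, pr in paths.items() if path.count("S") == 1)
last_only = paths["FFS"]  # the '5' appears on the last draw only

print(round(single_five, 3), round(binomial_pmf(1, n, p), 3), round(last_only, 3))
```

The three single-success paths each have probability 0.081, so their sum matches the value given by the Binomial equation for r = 1.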

Cumulative probability tables – case of p<0.5

These give the tabulated value of P(X≤x). This means that the value displayed is the probability that X is less than or equal to an observed value x.
The random variable X is distributed Binomially, where there are n trials and probability of success p.

Before going into any detail about using the tables, we must first look at their structure.
There are a number of table designs, but they more or less contain the same data. It is just a matter of emphasis.

The tables we are using were issued by the Edexcel Examining Board(2009).

A binomial distribution X ~ B(n, p) has values of p (across the top) and n (down the left side) in the following ranges:

p: 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50

n: 5, 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 40, 50


Case P(X≤x)

This is a straightforward lookup of the table.

Say that the number of trials n = 6, so that the random variable X has values X = 0, 1, 2, 3, 4, 5, 6

and that the probability of success p = 0.35.

We want to know the value of the probability that X is less than or equal to 3. That is, the value of P(X≤3).

From the table: P(X≤3) = 0.8826
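A row of a cumulative table can be rebuilt by adding up individual Binomial probabilities. The sketch below assumes the standard Binomial equation and uses made-up helper names (`binomial_pmf`, `binomial_cdf`); it is a way to check a lookup, not a description of how the published tables were produced.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ B(n, p): the quantity the tables list."""
    return sum(binomial_pmf(r, n, p) for r in range(x + 1))

n, p = 6, 0.35
row = [round(binomial_cdf(x, n, p), 4) for x in range(n + 1)]
for x, value in enumerate(row):
    print(f"P(X<={x}) = {value}")
```

The entry for x = 3 rounds to 0.8826, the value read from the table for n = 6 and p = 0.35.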

The case P(X=x)

Using the values of n and p from before, let’s say that we want to know the value of the probability that X is equal to 3. That is, the value of P(X=3).

To understand this we must break down the values P(X≤3) and P(X≤2) into their constituent probabilities.

The probability that the random variable X is less than or equal to 3 means that it can have values of ‘3’ or ‘2’ or ‘1’ or ‘0’.

This can be written as the sum of probabilities. Remember for probability work, the operator ‘+’ means OR.

P(X≤3) = P(X=3) + P(X=2) + P(X=1) + P(X=0)

By the same reasoning,

P(X≤2) = P(X=2) + P(X=1) + P(X=0)

subtracting the second equation from the first,

P(X≤3) – P(X≤2) = P(X=3)

turning the equation around,

P(X=3) = P(X≤3) – P(X≤2)

If we now look up the values for x = 3 and x = 2 from the tables:

P(X=3) = 0.8826 – 0.6471 = 0.2355
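As a cross-check, the difference of two cumulative values should agree with the Binomial equation evaluated directly at r = 3. This is a minimal sketch with hypothetical helper names, assuming n = 6 and p = 0.35 as in the example.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ B(n, p)."""
    return sum(binomial_pmf(r, n, p) for r in range(x + 1))

n, p = 6, 0.35
from_difference = binomial_cdf(3, n, p) - binomial_cdf(2, n, p)
direct = binomial_pmf(3, n, p)

print(round(from_difference, 4), round(direct, 4))
```

Both routes give the same value to 4 decimal places. Subtracting already-rounded table entries can occasionally differ from the direct value in the last digit.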

The case P(X<x)

Using the values of n and p from before, let’s say that we want to know the value of the probability that X is less than 3. That is, the value of P(X<3).

The probability that the random variable X is less than 3 means that it can have values of ‘2’ or ‘1’ or ‘0’.

This can be written as the sum of probabilities. Remember for probability work, the operator ‘+’ means OR.

P(X<3) = P(X=2) + P(X=1) + P(X=0)

but,

P(X≤2) = P(X=2) + P(X=1) + P(X=0)

therefore,

P(X<3) = P(X≤2)

since P(X≤2) = 0.6471

P(X<3) = 0.6471

The case P(X>x)

Using the values of n and p from before, let’s say that we want to know the value of the probability that X is greater than 3. That is, the value of P(X>3).

The probability that the random variable X is greater than 3 means that it can have values of ‘4’ or ‘5’ or ‘6’ (since n = 6, the random variable X has values X = 0, 1, 2, 3, 4, 5, 6).

This can be written as the sum of probabilities. Remember for probability work, the operator ‘+’ means OR.

P(X>3) = P(X=4) + P(X=5) + P(X=6)

but the sum of all the individual probabilities equals ‘1’.

P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + P(X=5) + P(X=6) = 1

rearranging, making P(X=4) + P(X=5) + P(X=6) the subject,

P(X=4) + P(X=5) + P(X=6) = 1 – [P(X=0) + P(X=1) + P(X=2) + P(X=3)]

but,

P(X≤3) = P(X=0) + P(X=1) + P(X=2) + P(X=3)

therefore,

P(X=4) + P(X=5) + P(X=6) = 1 – P(X≤3)

hence,

P(X>3) = 1 – P(X≤3)

from the table,

P(X≤3) = 0.8826

therefore,

P(X>3) = 1 – 0.8826 = 0.1174

The case P(X≥x)

Using the values of n and p from before, let’s say that we want to know the value of the probability that X is greater than or equal to 3. That is, the value of P(X≥3).

The probability that the random variable X is greater than or equal to 3 means that it can have values of ‘3’ or ‘4’ or ‘5’ or ‘6’ (since n = 6, the random variable X has values X = 0, 1, 2, 3, 4, 5, 6).

This can be written as the sum of probabilities. Remember for probability work, the operator ‘+’ means OR.

P(X≥3) = P(X=3) + P(X=4) + P(X=5) + P(X=6)

but the sum of all the individual probabilities equals ‘1’.

P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + P(X=5) + P(X=6) = 1

rearranging, making P(X=3) + P(X=4) + P(X=5) + P(X=6) the subject,

P(X=3) + P(X=4) + P(X=5) + P(X=6) = 1 – [P(X=0) + P(X=1) + P(X=2)]

but,

P(X≤2) = P(X=0) + P(X=1) + P(X=2)

it follows that,

P(X≥3) = 1 – P(X≤2)

putting in values,

P(X≥3) = 1 – 0.6471 = 0.3529
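Each inequality can be rewritten in terms of the tabulated quantity P(X≤x). The sketch below wraps those rewrites in small helper functions; the function names are invented for illustration, and the cumulative value is computed from the Binomial equation rather than read from a printed table.

```python
from math import comb

def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ B(n, p); returns 0 when x is below 0."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(x + 1))

def prob_less_than(x, n, p):
    return binomial_cdf(x - 1, n, p)      # P(X < x) = P(X <= x - 1)

def prob_greater_than(x, n, p):
    return 1 - binomial_cdf(x, n, p)      # P(X > x) = 1 - P(X <= x)

def prob_at_least(x, n, p):
    return 1 - binomial_cdf(x - 1, n, p)  # P(X >= x) = 1 - P(X <= x - 1)

n, p = 6, 0.35
print(round(prob_less_than(3, n, p), 4))
print(round(prob_greater_than(3, n, p), 4))
print(round(prob_at_least(3, n, p), 4))
```

For n = 6 and p = 0.35 the three printed values are 0.6471, 0.1174 and 0.3529.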

Cumulative probability tables – case of p>0.5

These give the tabulated value of P(X≤x). This means that the value displayed is the probability that X is less than or equal to an observed value x.
The random variable X is distributed Binomially, where there are n trials and probability of success p.

Working out probabilities for X when p>0.5 is complicated by the fact that the tabulated values of p only go up to 0.5.

The way around this problem is to consider another random variable Y, representing the number of failures.

Writing pX for the probability of success and pY for the probability of failure, we have: pX + pY = 1

In the same way as X, Y is distributed binomially:

Y ~ B(n, pY) where pY = 1 – pX

Say that n = 6, so that the random variable X has values X = 0, 1, 2, 3, 4, 5, 6

A table of values for X (success) and Y (failure) looks like this:

| X | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Y | 6 | 5 | 4 | 3 | 2 | 1 | 0 |

The method is to use the table to produce an expression in Y that will use values of p<0.5.
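The link between successes and failures can be checked numerically. In this sketch the relation Y = n – X is taken as the assumption (every trial is either a success or a failure, so the two counts add up to n), and the names `p_x`, `p_y` and `binomial_pmf` are illustrative.

```python
from math import comb, isclose

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(count = r) for a B(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p_x = 6, 0.85
p_y = 1 - p_x  # probability of failure

matches = []
for x in range(n + 1):
    y = n - x  # x successes means n - x failures
    prob_x = binomial_pmf(x, n, p_x)
    prob_y = binomial_pmf(y, n, p_y)
    matches.append(isclose(prob_x, prob_y))
    print(f"P(X={x}) = {prob_x:.4f}   P(Y={y}) = {prob_y:.4f}")
```

Every pair agrees, which is what allows a probability statement about X to be replaced by one about Y with pY < 0.5.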

The case P(X≤x)

Say we wish to find the value of P(X≤4) for:

X = 0, 1, 2, 3, 4, 5, 6      pX = 0.85      n = 6

(the tables only go up to p = 0.5)

X (success) and Y (failure) are related so:

| X | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Y | 6 | 5 | 4 | 3 | 2 | 1 | 0 |

The sum of successes and failures for each outcome must always be the same (i.e. 6).

The probability for Y becomes pY = 0.15, since pX + pY = 1. As pY < 0.5, it is on the table.

From the table of X and Y values,

P(X≤4) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4)

P(X≤4) = P(Y=6) + P(Y=5) + P(Y=4) + P(Y=3) + P(Y=2)

this can be written:

P(X≤4) = P(Y≥2)

since,

P(Y=0) + P(Y=1) + P(Y=2) + P(Y=3) + P(Y=4) + P(Y=5) + P(Y=6) = 1

P(Y=2) + P(Y=3) + P(Y=4) + P(Y=5) + P(Y=6) = 1 – [P(Y=0) + P(Y=1)]

in other words,

P(Y≥2) = 1 – P(Y≤1)

hence the original probability can be rewritten:

P(X≤4) = 1 – P(Y≤1)

Using the tables to find the value of P(Y≤1) for n = 6, pY = 0.15, y = 1:

P(Y≤1) = 0.7765

hence,

P(X≤4) = 1 – 0.7765 = 0.2235

The case P(X=x)

Consider the binomial distribution for success,

X ~ B(n, pX)

X = 0, 1, 2, 3, 4, 5, 6      pX = 0.85      n = 6

also the binomial distribution for failure,

Y ~ B(n, pY)

Y = 0, 1, 2, 3, 4, 5, 6      pY = 0.15      n = 6

Say we want to find P(X=4).

| X | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Y | 6 | 5 | 4 | 3 | 2 | 1 | 0 |

From the table it follows that,

P(X=4) = P(Y=2)

P(Y=2) = P(Y≤2) – P(Y≤1)

P(X=4) = P(Y≤2) – P(Y≤1)

P(X=4) = 0.9527 – 0.7765 = 0.1762

The case P(X<x)

Consider the binomial distribution for success,

X ~ B(n, pX)

X = 0, 1, 2, 3, 4, 5, 6      pX = 0.85      n = 6

also the binomial distribution for failure,

Y ~ B(n, pY)

Y = 0, 1, 2, 3, 4, 5, 6      pY = 0.15      n = 6

Say we want to find P(X<4).

| X | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Y | 6 | 5 | 4 | 3 | 2 | 1 | 0 |

From the table it follows that,

P(X<4) = P(Y>2)

and

P(Y>2) = P(Y=3) + P(Y=4) + P(Y=5) + P(Y=6)

it follows that,

P(Y>2) = P(Y≥3)

P(Y=0) + P(Y=1) + P(Y=2) + P(Y=3) + P(Y=4) + P(Y=5) + P(Y=6) = 1

P(Y=3) + P(Y=4) + P(Y=5) + P(Y=6) = 1 – [P(Y=0) + P(Y=1) + P(Y=2)]

P(Y≥3) = 1 – [P(Y=0) + P(Y=1) + P(Y=2)]

P(Y≥3) = 1 – P(Y≤2)

P(X<4) = 1 – P(Y≤2)

P(X<4) = 1 – 0.9527 = 0.0473

The case P(X>x)

Consider the binomial distribution for success,

X ~ B(n, pX)

X = 0, 1, 2, 3, 4, 5, 6      pX = 0.85      n = 6

also the binomial distribution for failure,

Y ~ B(n, pY)

Y = 0, 1, 2, 3, 4, 5, 6      pY = 0.15      n = 6

Say we want to find P(X>4).

| X | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Y | 6 | 5 | 4 | 3 | 2 | 1 | 0 |

P(X>4) = P(Y<2)

P(Y<2) = P(Y=0) + P(Y=1)

P(Y=0) + P(Y=1) = P(Y≤1)

P(X>4) = P(Y≤1)

reading P(Y≤1) directly from the tables,

P(X>4) = 0.7765

The case P(X≥x)

Consider the binomial distribution for success,

X ~ B(n, pX)

X = 0, 1, 2, 3, 4, 5, 6      pX = 0.85      n = 6

also the binomial distribution for failure,

Y ~ B(n, pY)

Y = 0, 1, 2, 3, 4, 5, 6      pY = 0.15      n = 6

Say we want to find P(X≥4).

| X | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Y | 6 | 5 | 4 | 3 | 2 | 1 | 0 |

P(X≥4) = P(Y≤2)

reading P(Y≤2) directly from the tables,

P(X≥4) = 0.9527
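One way to check answers obtained through the failure variable is to compute each probability twice: once directly for X ~ B(6, 0.85), and once through Y ~ B(6, 0.15) with Y = n – X. The sketch below is illustrative only; the helper names and the `checks` dictionary are assumptions made for this example.

```python
from math import comb, isclose

def pmf(r: int, n: int, p: float) -> float:
    """P(count = r) for a B(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def cdf(x: int, n: int, p: float) -> float:
    """P(count <= x) for a B(n, p) random variable; 0 when x is below 0."""
    return sum(pmf(r, n, p) for r in range(x + 1))

n, p_x = 6, 0.85
p_y = 1 - p_x
x = 4
y = n - x  # X = 4 successes corresponds to Y = 2 failures

# Each entry: (computed directly from X, computed through Y).
checks = {
    "P(X<=4)": (cdf(x, n, p_x), 1 - cdf(y - 1, n, p_y)),
    "P(X=4)": (pmf(x, n, p_x), cdf(y, n, p_y) - cdf(y - 1, n, p_y)),
    "P(X<4)": (cdf(x - 1, n, p_x), 1 - cdf(y, n, p_y)),
    "P(X>4)": (1 - cdf(x, n, p_x), cdf(y - 1, n, p_y)),
    "P(X>=4)": (1 - cdf(x - 1, n, p_x), cdf(y, n, p_y)),
}

for label, (direct, via_failures) in checks.items():
    print(f"{label}: direct {direct:.4f}, via Y {via_failures:.4f}")
```

The two columns agree in every row, since reversing the inequality in X gives the matching inequality in Y.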

The Mean

A random variable X distributed binomially, with n trials and constant probability of success p, is described as:

X ~ B(n, p)

The mean is given by,

mean = μ (mu) = E(X) = np

where E(X) is the expectation (expected value) of X.

Example

The chance of getting a red sweet from a box of 40 coloured sweets is 1/10. How many red sweets would you expect in each box?

Let the random variable X be the number of red sweets in a box.

Therefore,

X ~ B(40, 1/10)

since the mean/expected value is given by:

μ = E(X) = np

μ = 40 x 1/10 = 4

answer: you would expect 4 red sweets in each box
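The formula μ = np can be compared with the general definition of an expected value, the sum of each value of X multiplied by its probability. This is a minimal sketch using exact fractions; `binomial_pmf` is a made-up helper name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r: int, n: int, p: Fraction) -> Fraction:
    """P(X = r) for X ~ B(n, p), kept exact by using fractions."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 40, Fraction(1, 10)

# E(X) = sum of r * P(X = r) over every possible value of X.
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))

print(mean, n * p)
```

Both expressions evaluate to 4, the expected number of red sweets.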

Variance

For a random variable X distributed binomially, with n trials and constant probability of success p,

variance is given by:

variance = σ² (sigma squared) = Var(X) = np(1 – p)

Sometimes variance is written in terms of the probability of failure q.

Since p + q = 1, then q = (1 – p). The equation for variance now becomes:

variance = npq

Example

A five-sided spinner with the numbers 1, 2, 3, 4, 5 on its sectors is twirled 20 times and the number of ‘3’s scored is recorded.

i) How many times would you expect the ‘3’ to appear?
ii) What is the variance?

i) If X is the random variable distributed binomially,

X ~ B(20, 1/5)

μ = E(X) = np

μ = 20 x 1/5 = 4

answer: you would expect a ‘3‘ to be recorded 4 times

ii) since,

variance = npq

variance = 20 x 1/5 x 4/5 = 80/25 = 3.2

answer: the variance is 3.2
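The spinner example can also be explored by simulation. The sketch below is an illustration under stated assumptions: a fair spinner is modelled with `random.randint(1, 5)`, the seed and the number of repeated experiments are arbitrary choices, and the sample figures will only be close to np and npq, not equal to them.

```python
import random
from statistics import fmean, pvariance

random.seed(0)        # fixed seed so the run is repeatable
SPINS = 20            # n: spins per experiment
EXPERIMENTS = 20_000  # how many times the 20-spin experiment is repeated

def count_threes(spins: int) -> int:
    """Spin a fair five-sided spinner and count how often a 3 appears."""
    return sum(1 for _ in range(spins) if random.randint(1, 5) == 3)

results = [count_threes(SPINS) for _ in range(EXPERIMENTS)]

sample_mean = fmean(results)
sample_variance = pvariance(results)

print(f"sample mean     = {sample_mean:.2f}  (np  = {SPINS * 0.2:.1f})")
print(f"sample variance = {sample_variance:.2f}  (npq = {SPINS * 0.2 * 0.8:.1f})")
```

With more repeated experiments the sample mean and variance tend to settle nearer to 4 and 3.2.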


© FAWE, Powered by: Yaaka DN.