Probability Density Function

A probability density function, often abbreviated PDF, models probability over a continuous range. The probability of an event occurring between two values is equal to the area under the curve between the two values. The total area under a probability density function is equal to 1.
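The area interpretation can be checked numerically. The sketch below is only an illustration: the density f(x) = 2x on [0, 1] and the helper names `pdf` and `area` are hypothetical choices, not part of any library. It approximates the area under the curve with the midpoint rule.

```python
def pdf(x: float) -> float:
    """Hypothetical density f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Approximate the area under f between lo and hi with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


total = area(pdf, 0.0, 1.0)      # total area under the density, close to 1
between = area(pdf, 0.25, 0.5)   # P(0.25 < X < 0.5), close to 0.5**2 - 0.25**2 = 0.1875
```

For this density the exact area between 0.25 and 0.5 is 0.1875, so the approximation can be compared against a known value.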

Types of Probability Density Functions

There are a number of naturally occurring and useful probability density functions.

Continuous Uniform Distribution

The continuous uniform distribution is a distribution over a continuous range in which every value is equally likely. In applications, random number generators try to emulate continuous uniform distributions when producing random data. An example is the spreadsheet function RAND, which returns a number greater than or equal to 0 and less than 1, that is, in the range [0, 1).
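As a rough illustration, Python's standard `random` module behaves much like RAND: `random()` returns a float in [0, 1). The seed, the sample size, and the interval endpoints `a` and `b` below are arbitrary assumptions made for this sketch. For a uniform distribution on [0, 1], the probability of landing between `a` and `b` is simply `b - a`, so the observed fraction of samples should be close to that.

```python
import random

rng = random.Random(42)  # fixed seed, an arbitrary choice for repeatability
samples = [rng.random() for _ in range(100_000)]  # each value lies in [0, 1)

# For a uniform distribution on [0, 1], P(a <= X <= b) = b - a.
a, b = 0.2, 0.5
fraction = sum(a <= x <= b for x in samples) / len(samples)  # should be near 0.3
```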

Normal Distribution

The normal distribution is a symmetric, bell-shaped distribution that occurs naturally when collecting and measuring data. A normal distribution has two defining attributes: the mean, which describes the center or “balancing” point of the distribution, and the standard deviation, which describes how far values are spread from the mean. The total area under the bell curve is always 1; the standard deviation determines how that area is spread out, so the area within a given distance of the mean depends on the standard deviation, as shown in the two normal distributions below:

Normal Distribution area corresponding to standard deviation.
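The effect of the standard deviation can be sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The two distributions and their parameters are assumptions chosen for illustration. The `cdf` method used here returns the area under the curve to the left of a value.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0.0, sigma=1.0)  # illustrative parameters
wide = NormalDist(mu=0.0, sigma=2.0)    # same mean, larger standard deviation

# Symmetry: the density is equal at the same distance on either side of the mean.
left, right = narrow.pdf(-1.5), narrow.pdf(1.5)

# The wider curve has a lower peak because the total area stays 1.
peak_narrow, peak_wide = narrow.pdf(0.0), wide.pdf(0.0)

# Area within one standard deviation of the mean, about 0.6827 in both cases.
within_narrow = narrow.cdf(1.0) - narrow.cdf(-1.0)
within_wide = wide.cdf(2.0) - wide.cdf(-2.0)
```

Measured in units of the standard deviation, both curves hold the same share of their area; measured in absolute distance from the mean, the wider curve holds less.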

Calculating Probabilities

The probability of an event occurring between two values, $a$ and $b$, on a probability density function is equal to the area under the curve from $a$ to $b$. The general integral forms for calculating probability for PDFs are given below:

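| Probability | Integral | Description |
| --- | --- | --- |
| $P(X < a)$ | $\int_{-\infty}^{a} f(x)\,dx$ | The probability of an event occurring below a threshold $a$. |
| $P(X > a)$ | $\int_{a}^{\infty} f(x)\,dx$ | The probability of an event occurring above a threshold $a$. |
| $P(a < X < b)$ | $\int_{a}^{b} f(x)\,dx$ | The probability of an event occurring between $a$ and $b$. |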

In practice, these integrals can be difficult to calculate by hand. Instead, a cumulative distribution function is used to calculate the integrals and their corresponding areas.
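To show what evaluating these integrals directly involves, the sketch below approximates all three forms for a standard normal density. It makes several assumptions: the choice of distribution, the thresholds `a` and `b`, the hypothetical helper `integrate`, and the constant `TAIL`, which stands in for infinity because the density is negligible that far from the mean.

```python
from statistics import NormalDist

f = NormalDist(mu=0.0, sigma=1.0).pdf  # standard normal density, chosen for illustration


def integrate(lo: float, hi: float, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


TAIL = 10.0  # stand-in for infinity: the density beyond +/-10 is negligible here

a, b = -1.0, 2.0
p_below = integrate(-TAIL, a)   # P(X < a), integral from -infinity to a
p_above = integrate(a, TAIL)    # P(X > a), integral from a to infinity
p_between = integrate(a, b)     # P(a < X < b), integral from a to b
```

Each call sums thousands of density evaluations, which hints at why a precomputed cumulative function is the more practical tool.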

Cumulative Distribution Functions

In statistical function libraries, probability density functions have cumulative distribution function counterparts. A cumulative distribution function, abbreviated CDF, provides a way to calculate the area under the curve to the left of a value. That information is enough to calculate the probability of an event above a value, or between two values, for a PDF.

Probability Less than a value on the Normal Distribution

The probability above a value $a$ is equal to $1$, the total area under the curve, minus the probability below the value. This is given in the equation below:

$$P(X > a) = 1 - P(X < a)$$

Probability Greater than a value on the Normal Distribution

The probability between two values $a$ and $b$, where $a < b$, is equal to the area below $b$ minus the area below $a$. This is given in the equation below:

$$P(a < X < b) = P(X < b) - P(X < a)$$

Probability Between Values on the Normal Distribution
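The three CDF-based calculations can be sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), whose `cdf` method returns the area to the left of a value. The mean, standard deviation, and the values `a` and `b` are hypothetical numbers chosen so that `a` sits one standard deviation below the mean and `b` two above it.

```python
from statistics import NormalDist

dist = NormalDist(mu=100.0, sigma=15.0)  # hypothetical mean and standard deviation
a, b = 85.0, 130.0                       # hypothetical values, with a < b

p_below = dist.cdf(a)                    # P(X < a): area to the left of a
p_above = 1.0 - dist.cdf(a)              # P(X > a): total area 1 minus the area below a
p_between = dist.cdf(b) - dist.cdf(a)    # P(a < X < b): area below b minus area below a
```

With these numbers the results are approximately 0.1587, 0.8413, and 0.8186.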