A probability density function, often abbreviated PDF, models probability over a continuous range. The probability of an event occurring between two values is equal to the area under the curve between the two values. The total area under a probability density function is equal to 1.
There are a number of naturally occurring and useful probability density functions.
The continuous uniform distribution is a distribution over a continuous range in which every interval of the same width is equally likely to contain the outcome. In applications, random number generators try to emulate continuous uniform distributions when producing random data. An example is the spreadsheet function RAND, which returns a number greater than or equal to 0 and less than 1.
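As a rough illustration of what "uniform" means in practice, the sketch below draws numbers from Python's standard `random` module (used here as a stand-in for a spreadsheet's RAND; the seed, sample size, and bin count are arbitrary assumptions) and checks that each equal-width slice of the range receives about the same share of the samples.

```python
import random

# Fixed seed so the run is repeatable; the seed value itself is arbitrary.
rng = random.Random(42)

# random() returns a float in the half-open interval [0.0, 1.0).
samples = [rng.random() for _ in range(100_000)]

# Split the range into 10 equal-width bins and count the samples in each.
bins = [0] * 10
for x in samples:
    # min() guards against any rounding that could push the index to 10.
    bins[min(int(x * 10), 9)] += 1

# For a uniform distribution, each bin should hold close to 10% of the samples.
fractions = [count / len(samples) for count in bins]
print(fractions)
```

Each printed fraction should sit close to 0.1, which is the area under a flat density of height 1 over an interval of width 0.1.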
The normal distribution is a symmetric bell-shaped distribution that occurs naturally when collecting and measuring data. A normal distribution has two defining attributes: the mean, which describes the center or “balancing” point of the distribution, and the standard deviation, which describes how far values are spread from the mean of the distribution. The width of the bell curve is proportional to its standard deviation, while the total area under the curve always remains 1, as shown in the two normal distributions below:
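To make the roles of the mean and the standard deviation concrete, here is a minimal sketch using `statistics.NormalDist` from the Python standard library. The two distributions, their parameters, and the seed are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist, fmean, stdev

# Two hypothetical distributions with the same mean but different spreads.
narrow = NormalDist(mu=50.0, sigma=2.0)
wide = NormalDist(mu=50.0, sigma=10.0)

results = {}
for name, dist in (("narrow", narrow), ("wide", wide)):
    # Draw repeatable samples; the seed value is arbitrary.
    data = dist.samples(50_000, seed=7)
    # Record the sample mean, the sample standard deviation,
    # and the height of the density at its center.
    results[name] = (fmean(data), stdev(data), dist.pdf(dist.mean))
    print(f"{name}: mean={results[name][0]:.2f}, "
          f"stdev={results[name][1]:.2f}, peak={results[name][2]:.4f}")
```

Both samples should center near the same mean, while the distribution with the larger standard deviation has more widely scattered values and a lower peak, since the same total area is stretched over a wider range.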
The probability of an event occurring on a probability density function between two values, $a$ and $b$, is equal to the area under the curve from $a$ to $b$. The general integral forms for calculating probability for PDFs, where $f(x)$ denotes the density function, are given below:
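Probability | Integral | Description |
---|---|---|
$P(X < a)$ | $\int_{-\infty}^{a} f(x)\,dx$ | The probability of an event occurring below a threshold $a$. |
$P(X > a)$ | $\int_{a}^{\infty} f(x)\,dx$ | The probability of an event occurring above a threshold $a$. |
$P(a < X < b)$ | $\int_{a}^{b} f(x)\,dx$ | The probability of an event occurring between $a$ and $b$. |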
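The three integral forms can be approximated numerically. The sketch below is one possible approach, not a reference implementation: it assumes a standard normal density, uses a simple midpoint rule, and truncates the infinite limits at ±10, where the density is negligibly small. The helper names `normal_pdf` and `integrate` and the thresholds are hypothetical.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, lo, hi, steps=20_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# Assumption: the infinite limits are replaced by -10 and +10.
LOW, HIGH = -10.0, 10.0
a, b = -1.0, 2.0  # hypothetical thresholds

p_below = integrate(normal_pdf, LOW, a)    # area to the left of a
p_above = integrate(normal_pdf, a, HIGH)   # area to the right of a
p_between = integrate(normal_pdf, a, b)    # area between a and b

print(p_below, p_above, p_between)
```

The first two results should sum to approximately 1, since together they cover the whole area under the curve.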
In practice, these integrals can be difficult to calculate by hand. Instead, a cumulative distribution function is used to calculate the integrals and their corresponding areas.
In statistical function libraries, probability density functions have cumulative distribution function counterparts. A cumulative distribution function, abbreviated CDF, provides a way to calculate the area under the curve to the left of a value. That information is enough to calculate both the probability of an event above a value and the probability of an event between two values for a PDF.
The probability above a value $a$ is equal to $1$, the total area under the curve, minus the probability below the value. This is given in the equation below:

$$P(X > a) = 1 - P(X < a)$$
The probability between two values $a$ and $b$, where $a < b$, is equal to the area below $b$ minus the area below $a$. This is given in the equation below:

$$P(a < X < b) = P(X < b) - P(X < a)$$
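As a sketch of how a library CDF replaces the integrals, the example below uses the `cdf` method of `statistics.NormalDist` from the Python standard library. The distribution parameters and the two thresholds are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical example: a measurement with mean 100 and standard deviation 15.
dist = NormalDist(mu=100.0, sigma=15.0)
a, b = 85.0, 130.0

p_below = dist.cdf(a)                  # area to the left of a
p_above = 1.0 - dist.cdf(a)            # total area (1) minus the area to the left of a
p_between = dist.cdf(b) - dist.cdf(a)  # area left of b minus area left of a

print(f"P(X < {a}) = {p_below:.4f}")
print(f"P(X > {a}) = {p_above:.4f}")
print(f"P({a} < X < {b}) = {p_between:.4f}")
```

Only the CDF is evaluated here; no integral is computed explicitly, which is why statistical libraries usually expose a CDF alongside each PDF.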