The normal distribution is a continuous probability distribution that appears naturally in statistics and probability. The shape of the distribution is a “bell curve” whose center is equal to the mean of the distribution. The area under the curve is equal to $1$ and can be used to calculate the probability of an event occurring in a range of values.
$$f(x) = \frac{1}{\sigma\sqrt{\tau}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
The function shown above gives the general form of the normal distribution in terms of the standard deviation and mean of the distribution. This function describes a family of probability density functions that can be used to calculate probability. Note, probability density function is often abbreviated as PDF.
|The standard deviation, denoted with the symbol $\sigma$ (sigma), describes how far the population is distributed around the mean of the population.|
|The circle constant $\tau = 2\pi$ (tau) appears as a scaling factor that ensures the area under the distribution is equal to $1$.|
|Euler’s number $e$ is shorthand for the exponential function, where $e^{x} = \exp(x)$, and gives the expression useful properties in addition to making the values of the other variables more meaningful.|
|The mean of the population, denoted with the symbol $\mu$ (mu), describes the center of the distribution. The bell curve is symmetrical around the mean.|
|The input $x$.|
Given the input $x$, the function returns the relative likelihood of the value $x$ occurring. The area under the function can be used to calculate the probability of an event occurring for a range of values. This is discussed below.
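As a rough illustration of the description above, the density can be written as a small Python function. This is a minimal sketch, not a reference implementation: the name `normal_pdf` and its default arguments are assumptions made for this example, and `math.tau` is Python's built-in circle constant.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Relative likelihood of x under a normal distribution.

    mu is the mean and sigma is the standard deviation (sigma must be > 0).
    The function name and defaults are illustrative choices.
    """
    z = (x - mu) / sigma                          # distance from the mean, in standard deviations
    scale = 1.0 / (sigma * math.sqrt(math.tau))   # scaling factor that keeps the total area equal to 1
    return scale * math.exp(-0.5 * z * z)

# The density peaks at the mean and is symmetric around it.
print(normal_pdf(0.0))                    # peak of the standard normal, 1 / sqrt(tau)
print(normal_pdf(1.0), normal_pdf(-1.0))  # equal values on either side of the mean
```

Note that the returned value is a density, not a probability. Probabilities come from the area under the function over a range of inputs.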
Note, the standard normal distribution is a special case of the normal distribution where the mean is $0$ and the standard deviation is $1$. This distribution has historical significance because it allows values to be referenced in a lookup table rather than calculated by hand. Of course, computers make it trivial to compute values of, and areas under, any variation of the distribution.
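A lookup table for the standard normal distribution works for any normal distribution because a value can first be standardized as $z = (x - \mu) / \sigma$. The standardization step is not spelled out in the text above, so treat the sketch below as an added illustration. It uses `statistics.NormalDist` from the Python standard library, and the population values are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical population, chosen only for illustration
x = 130.0

# Standardize: how many standard deviations x lies from the mean.
z = (x - mu) / sigma

# Area to the left of x on the original distribution...
direct = NormalDist(mu, sigma).cdf(x)
# ...matches the area to the left of z on the standard normal (mean 0, standard deviation 1).
via_standard = NormalDist().cdf(z)

print(z, direct, via_standard)
```

The two areas agree up to floating-point rounding, which is why a single table for the standard normal was enough before computers.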
- The area under the curve is equal to $1$.
- The mean $\mu$ (mu) is the center of the distribution.
- The standard deviation $\sigma$ (sigma) describes how far values are from the mean.
For example, the plots below visualize these properties for normal distributions that share the same mean but have three different standard deviations. Note that while the shape of the function changes, the area within a given number of standard deviations of the mean stays the same.
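The same property can be checked numerically. The sketch below approximates areas with a simple midpoint rule. The standard deviations `0.5`, `1.0`, and `2.0` are illustrative choices, not necessarily the values used in the plots, and the helper names `normal_pdf` and `area` are assumptions made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(math.tau))

def area(f, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

results = {}
for sigma in (0.5, 1.0, 2.0):   # illustrative standard deviations
    f = lambda x, s=sigma: normal_pdf(x, 0.0, s)
    total = area(f, -10 * sigma, 10 * sigma)   # wide enough to capture essentially all of the area
    within_one = area(f, -sigma, sigma)        # area within one standard deviation of the mean
    results[sigma] = (total, within_one)
    print(f"sigma={sigma}: total={total:.6f}, within one sigma={within_one:.6f}")
```

For each standard deviation, the total area should come out close to 1 and the area within one standard deviation close to 0.6827, even though the curves look different.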
The probability of an event occurring on a probability density function between two values, $a$ and $b$, is equal to the area under the curve from $a$ to $b$. For example, the probability of an event occurring within $1$ standard deviation of the mean of a normal distribution is approximately $0.6827$. The general integral forms for calculating probability for PDFs are given below:
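|The probability of an event occurring below a threshold $a$: $P(X \le a) = \int_{-\infty}^{a} f(x)\,dx$.|
|The probability of an event occurring above a threshold $a$: $P(X > a) = \int_{a}^{\infty} f(x)\,dx$.|
|The probability of an event occurring between $a$ and $b$: $P(a < X \le b) = \int_{a}^{b} f(x)\,dx$.|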
In practice, these integrals prove tricky to calculate. Instead, the normal cumulative distribution function (CDF) is usually used. The normal CDF returns the area under the curve to the left of a value, which corresponds to the first case, the probability $P(X \le a)$ of an event occurring below a threshold $a$. This alone is enough to find the other integrals. These strategies are summarized below, before defining the normal CDF.
The probability of an event occurring below a threshold $a$ is equal to the integral of the PDF from negative infinity to the threshold. This is what the normal CDF returns.
The probability of an event occurring above a threshold $a$ is equal to $1$ minus the probability of the event occurring below the threshold. This is given in the equation below:
$$P(X > a) = 1 - P(X \le a)$$
The probability between two values $a$ and $b$, where $a < b$, is equal to the area below $b$ minus the area below $a$. This is given in the equation below:
$$P(a < X \le b) = P(X \le b) - P(X \le a)$$
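The three strategies can be sketched in Python. The example assumes the standard identity that expresses the normal CDF through the error function, $\frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right)$, which may differ in form from the definition this article gives. The function names are hypothetical and chosen only for illustration.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal density to the left of x (assumes the erf identity)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_below(a, mu=0.0, sigma=1.0):
    """P(X <= a): exactly what the CDF returns."""
    return normal_cdf(a, mu, sigma)

def prob_above(a, mu=0.0, sigma=1.0):
    """P(X > a): one minus the area below the threshold."""
    return 1.0 - normal_cdf(a, mu, sigma)

def prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a < X <= b), for a < b: area below b minus area below a."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(prob_below(0.0))           # half of the area lies below the mean
print(prob_above(0.0))           # and half lies above it
print(prob_between(-1.0, 1.0))   # area within one standard deviation of the mean
```

Only `normal_cdf` does any real work here. The other two cases are plain arithmetic on its result, which is why the CDF alone is enough.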