The standard normal distribution is the special case of a normal distribution with a mean of $0$ and a standard deviation of $1$. The distribution has historical significance because it allows standardized values to be referenced in a look-up table rather than calculated by hand. Its probability density function (PDF) has an area under the curve equal to $1$.
The probability density function of the standard normal distribution is:

$$f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$$

This standardized form of the normal distribution allows probabilities to be easily calculated or looked up.
Variable | Description |
---|---|
$\pi$ | The circle constant appears in the scaling factor that ensures the area under the distribution is equal to $1$. |
$e$ | Euler's number, $e \approx 2.71828$, is the base of the natural exponential function $e^x$, which defines a family of exponential functions with useful properties and meaningful variable values. |
$z$ | Standardized input, commonly referred to as a "z-score value". |
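As a minimal sketch, the density $\frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ can be evaluated directly with Python's standard library. The function names `standard_normal_pdf` and `area_under_pdf` are illustrative, and the integration range and step count are assumptions chosen only to show that the area is very close to $1$.

```python
import math

def standard_normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def area_under_pdf(lo=-10.0, hi=10.0, steps=100_000):
    """Approximate the area under the PDF on [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    total = sum(standard_normal_pdf(lo + (i + 0.5) * width) for i in range(steps))
    return total * width

print(standard_normal_pdf(0.0))  # peak density, equal to 1 / sqrt(2 * pi)
print(area_under_pdf())          # very close to 1
```

The range $[-10, 10]$ is used because the density beyond ten standard deviations is negligible for this purpose.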
The standard normal distribution "standardizes" values using z-score values, which is why the input to the PDF and CDF is represented by the variable $z$. A z-score expresses a value as its distance from the mean, measured in standard deviations. For example, to calculate a z-score for the value $x$ on a normal distribution with a mean equal to $\mu$ and a standard deviation equal to $\sigma$, the formula would be:

$$z = \frac{x - \mu}{\sigma}$$

A positive result means that the value $x$ is $z$ standard deviations to the right of the mean of the distribution, and a negative result means it lies to the left. Since the standard normal distribution and any other normal distribution share the same properties, the standard normal distribution can be used to calculate probabilities for any normal distribution.
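The z-score calculation can be sketched in a few lines of Python. The numbers used here (a value of 130 on a distribution with mean 100 and standard deviation 15) are hypothetical and chosen only for illustration; they are not taken from the text above.

```python
def z_score(x, mean, std_dev):
    """Distance of x from the mean, measured in standard deviations."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (x - mean) / std_dev

# Hypothetical example values, not from the original text.
print(z_score(130.0, 100.0, 15.0))  # 2.0, i.e. two standard deviations right of the mean
print(z_score(85.0, 100.0, 15.0))   # -1.0, i.e. one standard deviation left of the mean
```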
The probability of an event occurring between two values, $a$ and $b$, on a probability density function is equal to the area under the curve from $a$ to $b$. For example, the probability of an event occurring within $1$ standard deviation of the mean of a normal distribution is approximately $0.6827$. The general integral forms for calculating probability for PDFs are given below:
Probability | Integral | Description |
---|---|---|
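$P(X \le a)$ | $\int_{-\infty}^{a} f(x)\,dx$ | The probability of an event occurring below a threshold $a$. |
$P(X \ge a)$ | $\int_{a}^{\infty} f(x)\,dx$ | The probability of an event occurring above a threshold $a$. |
$P(a \le X \le b)$ | $\int_{a}^{b} f(x)\,dx$ | The probability of an event occurring between $a$ and $b$. |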
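To make the "area under the curve" idea concrete, the sketch below integrates a normal PDF numerically between two values. The midpoint rule, the step count, and the function names are assumptions made for illustration; in practice a closed-form CDF or a library routine would normally be used instead.

```python
import math

def normal_pdf(x, mean=0.0, std_dev=1.0):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std_dev
    return math.exp(-0.5 * z * z) / (std_dev * math.sqrt(2.0 * math.pi))

def probability_between(a, b, mean=0.0, std_dev=1.0, steps=10_000):
    """Approximate the integral of the PDF from a to b with the midpoint rule."""
    width = (b - a) / steps
    total = sum(normal_pdf(a + (i + 0.5) * width, mean, std_dev) for i in range(steps))
    return total * width

# Within one standard deviation of the mean on the standard normal distribution.
print(round(probability_between(-1.0, 1.0), 4))  # 0.6827

# The same probability on a hypothetical normal distribution (mean 100, sd 15).
print(round(probability_between(85.0, 115.0, mean=100.0, std_dev=15.0), 4))  # 0.6827
```

Both calls give the same result because each interval spans one standard deviation on either side of the mean.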
There are three common strategies, discussed below, for calculating these probabilities using the standard normal distribution: 1) Use a statistical function such as `NORM.S.DIST` as implemented in Excel and Google Sheets. 2) Use the normal cumulative distribution function (CDF) defined with the error function. 3) Look up the probability corresponding to a z-value in a table.
= NORM.S.DIST(1.5) = .9332
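The cumulative distribution function of the standard normal distribution can be written in terms of the error function:

$$\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right]$$

The output of the standard normal CDF is equal to the output of the `NORM.S.DIST` function. For example, $\Phi(1.5)$ is equal to $0.9332$.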
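The error-function form of the CDF maps directly onto `math.erf` from Python's standard library. This is a minimal sketch; the name `standard_normal_cdf` is illustrative.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal PDF to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(standard_normal_cdf(1.5), 4))  # 0.9332
print(standard_normal_cdf(0.0))            # 0.5, half the area lies left of the mean
```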
The output of the standard normal CDF corresponds to the area under the curve to the left of a value $z$, as shown in the graph below.
The plot of $\Phi(z)$ is given below.
The popularity of the standard normal distribution can likely be attributed to the difficulty of computing the area under the curve of the normal distribution. By standardizing the values of all normal distributions, different probabilities can be conveniently looked up in a table.
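A look-up table of this kind can be sketched by precomputing the CDF at two-decimal z-values. The layout below (a dictionary keyed by hundredths, covering $z$ from $0.00$ to $3.49$ and rounded to four places) is an assumption for illustration and is not the layout of any particular published table.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal PDF to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Keys are z-values in hundredths: 150 stands for z = 1.50.
Z_TABLE = {h: round(standard_normal_cdf(h / 100), 4) for h in range(0, 350)}

def look_up(z):
    """Look up the area to the left of z, using symmetry for negative z."""
    key = round(abs(z) * 100)
    if key not in Z_TABLE:
        raise ValueError("z is outside the range covered by the table")
    left_area = Z_TABLE[key]
    return left_area if z >= 0 else round(1.0 - left_area, 4)

print(look_up(1.5))   # 0.9332
print(look_up(-1.5))  # 0.0668
```

Because the distribution is symmetric about $0$, only non-negative z-values need to be stored.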
All three strategies discussed above find the area under the curve to the left of a threshold. The examples below demonstrate how that information is sufficient to find the probability of an event below a threshold, above a threshold, and between two thresholds.
The probability of an event occurring below a threshold on the standard normal distribution corresponds to the area under the curve to the left of the threshold.
The probability of an event occurring above a threshold on the standard normal distribution corresponds to the area under the curve to the right of the threshold. This probability is equal to $1$ minus the probability of the event occurring below the threshold.
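The probability of an event occurring between thresholds defined by the values $a$ and $b$, where $a < b$, is equal to the area to the left of $b$ minus the area to the left of $a$: $\Phi(b) - \Phi(a)$, where $\Phi$ is the standard normal CDF.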
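The three cases can be sketched on top of a single left-area function. The helper names `prob_below`, `prob_above`, and `prob_between` are illustrative, and the CDF is computed here with `math.erf`, which is one of several ways to obtain the area to the left of a threshold.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal PDF to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_below(a):
    """Probability of an event below threshold a: area to the left of a."""
    return standard_normal_cdf(a)

def prob_above(a):
    """Probability of an event above threshold a: 1 minus the area to the left."""
    return 1.0 - standard_normal_cdf(a)

def prob_between(a, b):
    """Probability of an event between a and b, where a <= b."""
    if a > b:
        raise ValueError("a must not exceed b")
    return standard_normal_cdf(b) - standard_normal_cdf(a)

print(round(prob_below(1.5), 4))          # 0.9332
print(round(prob_above(1.5), 4))          # 0.0668
print(round(prob_between(-1.0, 1.0), 4))  # 0.6827
```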