Figure: The classical bell-shaped curve, showing that data near the mean occur more frequently than data far from the mean.
Overview of Normal Distribution
The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetrical around its mean. It is one of the most important distributions in statistics because many phenomena in the natural and social sciences tend to approximate a normal distribution. The normal distribution is characterized by its bell-shaped curve.
Characteristics of the Normal Distribution
- Symmetry:
- The normal distribution is symmetrical about its mean, meaning that the left and right halves of the distribution are mirror images.
- Bell-Shaped Curve:
- The shape of the normal distribution curve is bell-shaped, with the highest point at the mean and tails that extend infinitely in both directions.
- Mean, Median, and Mode:
- In a normal distribution, the mean, median, and mode are all equal and located at the centre of the distribution.
- Standard Deviation:
- The spread of the normal distribution is determined by its standard deviation. A larger standard deviation results in a wider, flatter curve, while a smaller standard deviation results in a narrower, taller curve.
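The effect of the standard deviation on the curve's shape can be checked numerically. The following is a minimal sketch, assuming the usual normal density formula; the function name `normal_pdf` and the three sigma values are illustrative choices, not part of the notes above.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Peak height (the density at the mean) for a few illustrative sigmas.
peaks = {sigma: normal_pdf(0.0, mu=0.0, sigma=sigma) for sigma in (0.5, 1.0, 2.0)}
for sigma, peak in peaks.items():
    print(f"sigma={sigma}: peak height={peak:.4f}")

# Symmetry: points equally far from the mean have the same density.
print(math.isclose(normal_pdf(-1.3), normal_pdf(1.3)))
```

The peak height shrinks as sigma grows, which matches the "wider, flatter" versus "narrower, taller" description.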
Properties of the Normal Distribution
- 68-95-99.7 Rule (Empirical Rule):
- Approximately 68% of the data falls within one standard deviation of the mean.
- Approximately 95% of the data falls within two standard deviations of the mean.
- Approximately 99.7% of the data falls within three standard deviations of the mean.
- Total Area Under the Curve:
- The total area under the normal distribution curve is equal to 1, representing the total probability of all outcomes.
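Both properties can be verified with a short calculation. This sketch assumes the standard identity that the probability of falling within k standard deviations of the mean is erf(k / sqrt(2)); the helper name `prob_within` and the integration range are illustrative assumptions.

```python
import math
from statistics import NormalDist

def prob_within(k):
    """P(|X - mu| <= k * sigma); the same for every normal distribution."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")

# Midpoint-rule estimate of the area under the standard normal density.
# The range [-10, 10] and the step count are arbitrary illustrative choices.
std_normal = NormalDist(mu=0.0, sigma=1.0)
lo, hi, steps = -10.0, 10.0, 20_000
width = (hi - lo) / steps
area = sum(std_normal.pdf(lo + (i + 0.5) * width) for i in range(steps)) * width
print(f"estimated total area: {area:.6f}")
```

The three probabilities come out close to 0.6827, 0.9545, and 0.9973, which is where the rounded 68-95-99.7 figures come from.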
Standard Normal Distribution
- The standard normal distribution is a special case of the normal distribution with a mean of 0 and a standard deviation of 1.
- Values from a normal distribution can be transformed into the standard normal distribution using the z-score formula: z = (X - μ) / σ, where:
- X is the value from the original normal distribution.
- μ is the mean of the original normal distribution.
- σ is the standard deviation of the original normal distribution.
- Z-scores represent the number of standard deviations a value is from the mean, allowing comparison across different normal distributions.
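As a worked illustration of the z-score formula, the sketch below compares two scores from different scales. The two exams and their means and standard deviations are hypothetical numbers chosen only for the example.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical exam A: mean 70, standard deviation 10, observed score 85.
z_a = z_score(85, mu=70, sigma=10)
# Hypothetical exam B: mean 500, standard deviation 100, observed score 620.
z_b = z_score(620, mu=500, sigma=100)

print(f"z for exam A: {z_a:.2f}")
print(f"z for exam B: {z_b:.2f}")

# Under a normal model, the standard normal CDF gives the share of values below z.
share_below_a = NormalDist().cdf(z_a)
print(f"share of exam A scores below 85: {share_below_a:.4f}")
```

Although 620 is the larger raw number, the score of 85 lies further above its own mean in standard-deviation units (1.5 versus 1.2).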
Applications of the Normal Distribution
- Statistical Inference:
- Many statistical tests and confidence intervals are based on the assumption of normality.
- Central Limit Theorem: For sufficiently large sample sizes, the sampling distribution of the mean of independent observations approaches a normal distribution, regardless of the shape of the population's distribution, provided the population has a finite variance.
- Quality Control:
- The normal distribution is used in control charts to monitor processes and detect unusual variation.
- Natural and Social Sciences:
- Many variables in biology, psychology, economics, and other fields are approximately normally distributed, allowing for the use of normal distribution models.
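The Central Limit Theorem mentioned under Statistical Inference can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is an exponential distribution with mean 1 (strongly skewed), and the sample size, number of replicates, and seed are arbitrary illustrative choices.

```python
import math
import random
import statistics

random.seed(0)        # fixed seed so the run is repeatable
SAMPLE_SIZE = 50      # observations per sample
REPLICATES = 2000     # number of sample means to collect

def sample_mean(n):
    """Mean of n independent draws from an exponential population with mean 1."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(SAMPLE_SIZE) for _ in range(REPLICATES)]

center = statistics.fmean(means)
spread = statistics.stdev(means)
# Share of sample means within one standard deviation of their own average.
within_one_sd = sum(abs(m - center) <= spread for m in means) / REPLICATES

print(f"average of sample means: {center:.3f} (population mean is 1)")
print(f"sd of sample means: {spread:.3f} (theory: {1 / math.sqrt(SAMPLE_SIZE):.3f})")
print(f"share within one sd: {within_one_sd:.3f}")
```

Even though the individual observations are far from normal, the share of sample means within one standard deviation should land near the 68% expected for a normal distribution.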
Checking for Normality
- Visual Methods:
- Histogram: A graphical representation of the data that can show the bell-shaped curve of a normal distribution.
- Q-Q Plot: A plot of the quantiles of the data against the quantiles of a normal distribution; points falling close to a straight line suggest that the data are approximately normal.
- Statistical Tests:
- Shapiro-Wilk Test: Tests the null hypothesis that the data were drawn from a normal distribution; a small p-value is evidence against normality.
- Kolmogorov-Smirnov Test: Compares the sample distribution with a specified distribution (e.g., normal distribution).
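The idea behind the Q-Q plot can be sketched without a plotting library by measuring how close the points are to a straight line. Note that this is a numerical summary of a Q-Q plot, not an implementation of the Shapiro-Wilk or Kolmogorov-Smirnov tests; the function name `qq_correlation`, the plotting positions (i + 0.5) / n, and the simulated samples are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def qq_correlation(data):
    """Correlation between sorted data and matching standard normal quantiles."""
    n = len(data)
    observed = sorted(data)
    std_normal = NormalDist()
    # Theoretical quantiles at evenly spaced probabilities strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mo, mt = fmean(observed), fmean(theoretical)
    cov = sum((o - mo) * (t - mt) for o, t in zip(observed, theoretical))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_t = sum((t - mt) ** 2 for t in theoretical)
    return cov / math.sqrt(var_o * var_t)

random.seed(1)  # fixed seed so the run is repeatable
normal_sample = [random.gauss(10.0, 2.0) for _ in range(500)]
skewed_sample = [random.expovariate(1.0) for _ in range(500)]

r_normal = qq_correlation(normal_sample)
r_skewed = qq_correlation(skewed_sample)
print(f"Q-Q correlation, normal sample: {r_normal:.4f}")
print(f"Q-Q correlation, skewed sample: {r_skewed:.4f}")
```

A correlation very close to 1 corresponds to a nearly straight Q-Q plot; the skewed sample should score visibly lower. For formal testing, statistical libraries provide ready-made routines (for example, SciPy's `scipy.stats.shapiro` and `scipy.stats.kstest`).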
Summary
The normal distribution is a fundamental concept in statistics, characterized by its symmetrical, bell-shaped curve. It is defined by its mean and standard deviation, with key properties such as the 68-95-99.7 rule. The standard normal distribution allows for the comparison of different normal distributions through z-scores. Understanding and applying the normal distribution is crucial in statistical inference, quality control, and various scientific fields.