Correlation (PMCC)

Product Moment Correlation Coefficient.

Core idea

Overview

The Pearson Product-Moment Correlation Coefficient (PMCC) serves as a statistical measure to quantify the strength and direction of a linear relationship between two continuous variables. It standardizes the covariance of the variables by the product of their standard deviations, resulting in a dimensionless index ranging from -1 to +1.

When to use: Apply this formula to paired quantitative data to check whether the two variables tend to move together along a straight line. It is designed specifically for linear associations; hypothesis tests based on r additionally assume that the data are sampled from a bivariate normal distribution.

Why it matters: This coefficient is a cornerstone of predictive modeling, allowing scientists to identify patterns in climate data, economists to hedge risk in financial markets, and sociologists to find links between demographic factors. It provides an objective mathematical basis for judging whether, and how strongly, two phenomena are linearly associated.

Symbols

Variables

  • r: Correlation
  • S_{xy}: Covariance Sum
  • S_{xx}: Var Sum X
  • S_{yy}: Var Sum Y

Walkthrough

Derivation

Formula: Product Moment Correlation Coefficient (PMCC)

Pearson’s PMCC r measures the strength and direction of linear association between two variables, ranging from -1 to 1.

  • The relationship is approximately linear.
  • Outliers can strongly affect r.
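
1. Define the Summary Quantities:

Compute the cross-deviation sum and the sums of squares for x and y:

S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})
S_{xx} = \sum (x_i - \bar{x})^2
S_{yy} = \sum (y_i - \bar{y})^2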

2. State the PMCC Formula:

Divide the covariance-like measure S_{xy} by the product of the spreads, \sqrt{S_{xx} S_{yy}}, to standardise the result.

Note: r=1 is perfect positive linear correlation, r=-1 perfect negative, and r=0 no linear correlation.

Result

r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}

Source: AQA A-Level Mathematics — Statistics (Bivariate Data)
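
The two steps of the derivation can be sketched in a few lines of Python. This is a minimal illustration, not an official implementation: the function names `summary_quantities` and `pmcc` and the sample data are assumptions introduced here and do not come from the source.

```python
from math import sqrt

def summary_quantities(xs, ys):
    """Step 1: return (S_xy, S_xx, S_yy) for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy, s_xx, s_yy

def pmcc(xs, ys):
    """Step 2: r = S_xy / sqrt(S_xx * S_yy)."""
    s_xy, s_xx, s_yy = summary_quantities(xs, ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return s_xy / sqrt(s_xx * s_yy)

# Illustrative data only (made up for this sketch).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(summary_quantities(xs, ys))  # (6.0, 10.0, 6.0)
print(round(pmcc(xs, ys), 4))      # 0.7746
```

The guard against a zero sum of squares reflects the fact that r is undefined when one of the variables does not vary at all.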

Why it behaves this way

Intuition

Imagine a scatter plot of data points; the PMCC quantifies how closely these points cluster around a straight line and whether that line slopes upwards (positive correlation) or downwards (negative correlation).

  • r: A standardized measure of the strength and direction of a linear relationship between two variables. Ranges from -1 (perfect negative linear correlation) to +1 (perfect positive linear correlation), with 0 indicating no linear correlation.
  • S_{xy}: Sum of the products of the deviations of each variable from its mean. Indicates whether variables tend to increase/decrease together (positive) or in opposite directions (negative).
  • S_{xx}: Sum of the squared deviations of the x-variable from its mean. Represents the total variability or spread within the x-variable data.
  • S_{yy}: Sum of the squared deviations of the y-variable from its mean. Represents the total variability or spread within the y-variable data.
  • √(S_{xx} S_{yy}): A normalization factor derived from the individual variabilities of x and y. Scales the covariance-like term (S_xy) so that the correlation coefficient 'r' is always between -1 and +1, making it a dimensionless measure.

Signs and relationships

  • S_{xy}: The sign of S_xy directly determines the sign of 'r'. A positive S_xy indicates that as one variable increases, the other tends to increase (positive correlation).
  • √(S_{xx} S_{yy}): This term is positive whenever both variables actually vary, because S_xx and S_yy are sums of squares and therefore non-negative. It acts as a scaling factor, ensuring that the absolute value of 'r' never exceeds 1, thereby standardizing the measure of linear association.
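
The reason the scaling factor keeps 'r' within [-1, 1] can be sketched with the Cauchy-Schwarz inequality. The shorthand a_i = x_i - \bar{x} and b_i = y_i - \bar{y} is introduced here purely for illustration:

```latex
\left( \sum_i a_i b_i \right)^2 \le \left( \sum_i a_i^2 \right) \left( \sum_i b_i^2 \right)
\;\Longrightarrow\;
S_{xy}^2 \le S_{xx} S_{yy}
\;\Longrightarrow\;
|r| = \frac{|S_{xy}|}{\sqrt{S_{xx} S_{yy}}} \le 1
```

Equality holds exactly when the deviations of y are a constant multiple of the deviations of x, which is the case of a perfect straight-line fit.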

Free study cues

Insight

Canonical usage

The Pearson Product-Moment Correlation Coefficient (PMCC) is a dimensionless statistical measure, used to quantify the strength and direction of a linear relationship between two variables, and is reported as a unitless value between -1 and +1.

Common confusion

A common mistake is attempting to assign a physical unit to the correlation coefficient 'r'. It is a statistical measure of association and is always dimensionless.

Dimension note

The Pearson Product-Moment Correlation Coefficient (PMCC) is a ratio of the covariance of two variables to the product of their standard deviations.
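
A short sketch of why this ratio matches the summary-quantity form r = S_xy / √(S_xx S_yy), assuming the sample covariance and sample standard deviations use the divisor n - 1 (the same cancellation occurs if the divisor n is used throughout):

```latex
r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y}
  = \frac{S_{xy} / (n - 1)}{\sqrt{S_{xx} / (n - 1)} \; \sqrt{S_{yy} / (n - 1)}}
  = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}
```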

Unit systems

  • r: dimensionless - The Pearson Product-Moment Correlation Coefficient itself is inherently dimensionless, representing a standardized measure of linear association.
  • S_{xy}: [unit of X] * [unit of Y] - Represents the sum of products of deviations, where [unit of X] and [unit of Y] are the units of the two variables being correlated. For example, if X is in meters and Y is in kilograms, S_xy would be in meter-kilograms.
  • S_{xx}: [unit of X]^2 - Represents the sum of squared deviations for the first variable. If X is in meters, S_xx would be in meters squared.
  • S_{yy}: [unit of Y]^2 - Represents the sum of squared deviations for the second variable. If Y is in kilograms, S_yy would be in kilograms squared.
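
One way to see the unit bookkeeping in practice is to change units and recompute. The sketch below uses made-up height and mass values (not from the source) and hypothetical helper names. Converting meters to centimeters and kilograms to grams rescales S_xy by a factor of 100 × 1000, while r should stay the same up to floating-point rounding.

```python
from math import isclose, sqrt

def deviation_sums(xs, ys):
    """Return (S_xy, S_xx, S_yy) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy, s_xx, s_yy

def pmcc(xs, ys):
    s_xy, s_xx, s_yy = deviation_sums(xs, ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical measurements, for illustration only.
heights_m = [1.60, 1.65, 1.70, 1.75, 1.80]
masses_kg = [55.0, 62.0, 60.0, 70.0, 74.0]

# The same measurements expressed in centimeters and grams.
heights_cm = [100 * h for h in heights_m]
masses_g = [1000 * m for m in masses_kg]

s_xy_si, _, _ = deviation_sums(heights_m, masses_kg)
s_xy_small, _, _ = deviation_sums(heights_cm, masses_g)

# S_xy carries units, so its numerical value depends on the unit choice.
print(s_xy_small / s_xy_si)

# r is dimensionless, so the two unit systems should agree.
print(isclose(pmcc(heights_m, masses_kg), pmcc(heights_cm, masses_g)))
```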

One free problem

Practice Problem

A researcher is studying the link between study hours and exam scores. Given the sum of products Sxy = 45, the sum of squares for study hours Sxx = 25, and the sum of squares for exam scores Syy = 100, calculate the correlation coefficient r.

Covariance Sum = 45
Var Sum X = 25
Var Sum Y = 100

Solve for: r

Hint: Divide the sum of products by the square root of the product of individual sums of squares.
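
Following the hint, the arithmetic can be checked with a few lines of Python. The variable names are chosen here for illustration; the three input values are those given in the problem.

```python
from math import sqrt

s_xy = 45   # sum of products (Covariance Sum)
s_xx = 25   # sum of squares for study hours (Var Sum X)
s_yy = 100  # sum of squares for exam scores (Var Sum Y)

# r = S_xy / sqrt(S_xx * S_yy) = 45 / sqrt(2500) = 45 / 50
r = s_xy / sqrt(s_xx * s_yy)
print(r)  # 0.9
```

A value of 0.9 indicates a strong positive linear correlation between study hours and exam scores in this data set.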

Where it shows up

Real-World Context

For example, when studying the correlation between height and shoe size, the PMCC is calculated from the covariance sum S_xy and the sums of squares S_xx and S_yy. The result shows how strong the linear association is, which helps in weighing the evidence before drawing a conclusion from the data.

Study smarter

Tips

  • Always visualize data with a scatter plot first to confirm a linear trend exists.
  • Be wary of outliers, as they can significantly inflate or deflate the value of r.
  • Remember that a correlation of zero implies no linear relationship, but a non-linear relationship may still exist.
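
The last two tips can be illustrated with a small sketch. All data below are made up for this example, and `pmcc` is a hypothetical helper: the first case shows a perfectly determined but non-linear relationship with r equal to zero, and the second shows a single outlier turning a weak correlation into an apparently strong one.

```python
from math import sqrt

def pmcc(xs, ys):
    """r = S_xy / sqrt(S_xx * S_yy) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Case 1: y is completely determined by x (y = x^2), but not linearly.
xs_curve = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs_curve]
r_curve = pmcc(xs_curve, ys_curve)
print(r_curve)  # zero: no linear correlation despite a clear relationship

# Case 2: a weakly correlated sample, then the same sample plus one outlier.
xs_base = [1, 2, 3, 4, 5]
ys_base = [3, 1, 4, 2, 3]
r_base = pmcc(xs_base, ys_base)
r_outlier = pmcc(xs_base + [20], ys_base + [20])
print(round(r_base, 3), round(r_outlier, 3))  # weak, then close to 1
```

A scatter plot of either data set would reveal the issue immediately, which is why plotting first is recommended.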

Avoid these traps

Common Mistakes

  • Confusing correlation with causation.
  • Accepting a value with |r| > 1, which always indicates a calculation error.

References

Sources

  1. Wikipedia: Pearson correlation coefficient
  2. Walpole, Myers, Myers, and Ye. Probability and Statistics for Engineers and Scientists (9th Edition).
  3. Moore, David S., George P. McCabe, and Bruce A. Craig. Introduction to the Practice of Statistics.
  4. Agresti, Alan. Statistical Methods for the Social Sciences.
  5. AQA A-Level Mathematics - Statistics (Bivariate Data)