Simple Linear Regression Equation
Models the linear relationship between a dependent variable and a single independent variable.
Core idea
Overview
Simple linear regression is a statistical method used to predict the value of a dependent variable (Y) based on the value of a single independent variable (X). It fits a straight line (the regression line) to the observed data by minimizing the sum of squared residuals. The equation provides the intercept (b₀), which is the predicted value of Y when X is zero, and the slope (b₁), which is the predicted change in Y for a one-unit change in X.
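As a minimal sketch of how the line is fitted, the snippet below computes b₀ and b₁ with the standard closed-form least-squares formulas. The function name `fit_simple_ols` and the education/income numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean


def fit_simple_ols(xs, ys):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of X around its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must vary to estimate a slope")
    # Sum of cross-products of the X and Y deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: years of education (X) and an income index (Y).
education = [10, 12, 14, 16, 18]
income = [20.0, 26.0, 29.0, 37.0, 38.0]
b0, b1 = fit_simple_ols(education, income)
print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
```

In practice a statistics package would normally be used for this; the hand-rolled version only makes the arithmetic visible.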
When to use: Applied when a researcher wants to understand or predict a continuous outcome variable based on a single continuous predictor. Common in studies examining the impact of education on income, age on political attitudes, or social capital on health outcomes.
Why it matters: A foundation for describing associations and making predictions in social science. It allows sociologists to quantify the strength and direction of a relationship and to test theoretical hypotheses about social processes and inequalities. Controlling for other variables requires extending the model to multiple regression.
Symbols
Variables
b₀ = Intercept, b₁ = Slope, X = Independent Variable, Ŷ = Predicted Dependent Variable
Walkthrough
Derivation
Formula: Simple Linear Regression Equation
Defines the linear model for predicting a dependent variable from an independent variable.
Assumptions:
- Linear relationship between X and Y.
- Independent observations.
- Homoscedasticity (constant variance of residuals).
- Normally distributed residuals.
Define the linear model:
Y = β₀ + β₁X + ε
The true population model, where Y is a linear function of X plus an error term (ε). We estimate this with sample data.
Estimate with sample data:
Ŷ = b₀ + b₁X
The estimated regression line, where Ŷ is the predicted value, b₀ is the estimated intercept, and b₁ is the estimated slope, derived using the method of Ordinary Least Squares (OLS).
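For reference, OLS chooses the intercept and slope that minimize the sum of squared residuals, and for one predictor this has a standard closed-form solution. The index i, the sample size n, and the sample means (written with bars) are notation introduced here, not taken from the page.

```latex
\min_{b_0,\, b_1} \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 X_i \right)^2
\quad\Longrightarrow\quad
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
```

The slope formula requires that X takes at least two distinct values; otherwise the denominator is zero.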
Result
Ŷ = b₀ + b₁X
Source: Legendre, A. M. (1805). Nouvelles méthodes pour la détermination des orbites des comètes. Paris: Courcier.
Free formulas
Rearrangements
Solve for b₀
Make b₀ the subject of the Simple Linear Regression Equation
Rearrange the simple linear regression equation to solve for the intercept: b₀ = Ŷ - b₁X.
Difficulty: 2/5
Solve for b₁
Make b₁ the subject of the Simple Linear Regression Equation
Rearrange the simple linear regression equation to solve for the slope: b₁ = (Ŷ - b₀) / X, provided X ≠ 0.
Difficulty: 2/5
Solve for X
Make X the subject of the Simple Linear Regression Equation
Rearrange the simple linear regression equation to solve for the independent variable: X = (Ŷ - b₀) / b₁, provided b₁ ≠ 0.
Difficulty: 2/5
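The three rearrangements can be checked with a short round-trip sketch. The function names and the numbers are hypothetical. These are algebraic rearrangements for a single point on the line, not a way of estimating coefficients from data.

```python
def predict(b0, b1, x):
    """Regression equation: y_hat = b0 + b1 * x."""
    return b0 + b1 * x


def solve_intercept(y_hat, b1, x):
    """Rearranged for the intercept: b0 = y_hat - b1 * x."""
    return y_hat - b1 * x


def solve_slope(y_hat, b0, x):
    """Rearranged for the slope: b1 = (y_hat - b0) / x, defined only for x != 0."""
    if x == 0:
        raise ValueError("the slope cannot be recovered when x is 0")
    return (y_hat - b0) / x


def solve_x(y_hat, b0, b1):
    """Rearranged for the predictor: x = (y_hat - b0) / b1, defined only for b1 != 0."""
    if b1 == 0:
        raise ValueError("x cannot be recovered when the slope is 0")
    return (y_hat - b0) / b1


# Round trip with illustrative values: predict, then recover each quantity.
y_hat = predict(b0=2.0, b1=0.5, x=8.0)
recovered_b0 = solve_intercept(y_hat, b1=0.5, x=8.0)
recovered_b1 = solve_slope(y_hat, b0=2.0, x=8.0)
recovered_x = solve_x(y_hat, b0=2.0, b1=0.5)
```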
Visual intuition
Graph
The graph is a straight line: the predicted value changes at a constant rate as the independent variable increases. For a sociology student, this means that a one-unit change in the independent variable predicts the same shift in the dependent variable whether the independent variable is small or large. The slope b₁ sets that constant rate of change, and the intercept b₀ sets where the line crosses the vertical axis.
Graph type: linear
Why it behaves this way
Intuition
The regression line is a straight line drawn through a scatter plot of data points: the best linear fit, chosen to minimize the sum of squared vertical distances (residuals) between the points and the line.
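One way to see the "best fit" idea is to compare the sum of squared residuals for the least-squares line against a few nearby lines. The data and the alternative lines below are made up for illustration, and `sse` is a hypothetical helper name.

```python
from statistics import mean


def sse(b0, b1, xs, ys):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

# Closed-form least-squares estimates.
x_bar, y_bar = mean(xs), mean(ys)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

best = sse(b0, b1, xs, ys)
# Nudging the intercept or the slope away from the OLS values increases the SSE.
alternatives = [(b0 + 0.5, b1), (b0, b1 + 0.1), (b0 - 0.5, b1 - 0.1)]
for alt_b0, alt_b1 in alternatives:
    print(f"SSE for OLS line: {best:.3f}, for shifted line: {sse(alt_b0, alt_b1, xs, ys):.3f}")
```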
Signs and relationships
- b₁: The sign of b₁ indicates the direction of the linear relationship between X and Y. A positive b₁ means Ŷ increases as X increases (a positive association), while a negative b₁ means Ŷ decreases as X increases (a negative association).
Free study cues
Insight
Canonical usage
Units of variables are preserved through the regression, with coefficients inheriting units derived from the dependent and independent variables.
Common confusion
A common mistake is misinterpreting the units of the slope coefficient (b₁), especially when X or Y are percentages or scores. The unit of b₁ is 'units of Y per unit of X'. It is not a percentage change unless the model is explicitly specified that way, for example with log-transformed variables.
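A small sketch can make the "units of Y per unit of X" point concrete: re-expressing X in different units rescales the slope but leaves the intercept alone. The variables (time in years versus months, and a score) and the numbers are hypothetical.

```python
from statistics import mean


def slope_intercept(xs, ys):
    """Closed-form least-squares fit; returns (b0, b1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - b1 * x_bar, b1


# Hypothetical data: time measured in years, and an outcome score.
years = [1, 2, 3, 4, 5]
score = [52.0, 55.0, 61.0, 64.0, 68.0]
# The same predictor expressed in months.
months = [12 * t for t in years]

b0_years, b1_years = slope_intercept(years, score)
b0_months, b1_months = slope_intercept(months, score)

# The slope per year is 12 times the slope per month; the intercept is unchanged.
print(f"points per year:  {b1_years:.4f}")
print(f"points per month: {b1_months:.4f}")
```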
Dimension note
While individual variables (e.g., scores, proportions) may be dimensionless, the equation still relates quantities whose 'units' (whether physical, monetary, or abstract scores) carry through to the coefficients.
One free problem
Practice Problem
A regression model predicts an individual's political participation (Ŷ) based on their age (X). The intercept (b₀) is 5, and the slope (b₁) is 3. What is the predicted political participation score for an individual who is 10 years old?
Solve for: Ŷ
Hint: Substitute the given values into the regression equation: Ŷ = b₀ + b₁X.
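The substitution in the hint can be sketched in a few lines, using the values given in the problem. The function name `predict` is a hypothetical helper.

```python
def predict(b0, b1, x):
    """Regression equation: y_hat = b0 + b1 * x."""
    return b0 + b1 * x


# Values from the practice problem: intercept 5, slope 3, age 10.
predicted = predict(b0=5, b1=3, x=10)
print(predicted)  # 35
```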
Where it shows up
Real-World Context
A sociologist uses simple linear regression to predict an individual's level of social trust based on their reported level of community engagement.
Study smarter
Tips
- The slope (b₁) indicates the average change in Y for a one-unit increase in X.
- The intercept (b₀) is the predicted value of Y when X is zero, but only interpretable if X=0 is meaningful.
- Always check regression assumptions (linearity, independence, homoscedasticity, normality of residuals).
- Regression models describe and predict associations; they do not establish causation without careful research design.
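As a rough sketch of what checking assumptions can involve, the snippet below computes residuals and compares their spread in the lower and upper halves of X. This is an informal heuristic on made-up data, not a formal test. The helper `spread_ratio` is hypothetical, and a residual plot or a dedicated diagnostic from a statistics package would usually be preferred.

```python
from statistics import mean, pvariance


def fit(xs, ys):
    """Closed-form least-squares fit; returns (b0, b1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - b1 * x_bar, b1


def spread_ratio(xs, res):
    """Residual variance in the upper half of X divided by that in the lower half."""
    paired = sorted(zip(xs, res))
    half = len(paired) // 2
    low = [r for _, r in paired[:half]]
    high = [r for _, r in paired[-half:]]
    return pvariance(high) / pvariance(low)


# Illustrative data only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.1, 4.8, 7.2, 8.9, 11.3, 12.6, 15.4, 16.7]

b0, b1 = fit(xs, ys)
res = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# With an intercept in the model, OLS residuals sum to zero up to rounding.
print(f"sum of residuals: {sum(res):.2e}")
# A ratio far from 1 hints that the residual spread changes with X.
ratio = spread_ratio(xs, res)
print(f"upper/lower residual variance ratio: {ratio:.2f}")
```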
Avoid these traps
Common Mistakes
- Extrapolating beyond the range of the observed data.
- Assuming causation without experimental design.
- Ignoring violations of regression assumptions.
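The extrapolation trap can be guarded against in code by refusing to predict outside the observed range of X. The sketch below assumes a hypothetical sample whose ages run from 18 to 80; the function name and the bounds are illustrative.

```python
def predict_within_range(b0, b1, x, x_min, x_max):
    """Predict y_hat = b0 + b1 * x, but only inside the observed range of X."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x={x} lies outside the observed range [{x_min}, {x_max}]; "
            "the fitted line is not supported by data there"
        )
    return b0 + b1 * x


# Hypothetical fitted model and observed age range.
inside = predict_within_range(b0=5, b1=3, x=20, x_min=18, x_max=80)

try:
    predict_within_range(b0=5, b1=3, x=10, x_min=18, x_max=80)
    refused = False
except ValueError:
    # Age 10 is below the youngest observed respondent, so no prediction is made.
    refused = True
```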
References
Sources
- Andy Field, Discovering Statistics Using IBM SPSS Statistics
- Andy Field, Discovering Statistics Using R and RStudio, 2012, SAGE Publications
- Alan Agresti, Statistical Methods for the Social Sciences, 5th ed., 2018, Pearson
- Alan Agresti, Barbara Finlay, Statistical Methods for the Social Sciences
- John Neter, Michael H. Kutner, Christopher J. Nachtsheim, William Wasserman, Applied Linear Regression Models
- Wikipedia: Simple linear regression
- Wikipedia: Linear regression