
Logistic Regression (Log-Odds)

Calculates the log-odds of an event occurring based on a linear combination of predictor variables.



Core idea

Overview

Logistic regression is a statistical model used to predict the probability of a binary outcome (e.g., yes/no, pass/fail) based on one or more predictor variables. Unlike linear regression, it models the log-odds (logit) of the outcome, which ensures that the predicted probabilities fall between 0 and 1. This transformation allows for the use of linear models for classification problems, making it a powerful tool in fields like psychology for predicting categorical behaviors or states.

When to use: Apply this formula when you need to model the relationship between a set of independent variables and a dichotomous (binary) dependent variable. It's used to predict the probability of an event occurring, such as predicting whether a student will pass an exam or if a patient will respond to a treatment.

Why it matters: Logistic regression is fundamental for understanding and predicting categorical outcomes in many scientific and practical domains. In psychology, it helps researchers identify factors influencing decisions, diagnoses, or behavioral choices. Its ability to provide probabilities makes it invaluable for risk assessment, intervention planning, and developing predictive models for real-world applications.
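As a concrete illustration of the pass/fail example above, here is a minimal Python sketch using scikit-learn; the library choice and the data are assumptions for illustration, not part of the source.

```python
# A minimal sketch, assuming scikit-learn is installed; the data are invented.
from sklearn.linear_model import LogisticRegression

X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]  # hours studied (one predictor)
y = [0, 0, 0, 1, 1, 1]                           # exam outcome: fail (0) / pass (1)

model = LogisticRegression()
model.fit(X, y)

print(model.intercept_, model.coef_)       # fitted β₀ and β₁ (on the log-odds scale)
print(model.predict_proba([[3.5]])[0][1])  # predicted probability of passing
```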

Symbols

Variables

p = Probability of Event (dimensionless)
β₀ = Intercept (dimensionless)
β₁ = Coefficient for x₁ (units: 1/[x₁]; dimensionless if x₁ is dimensionless)
x₁ = Predictor Variable 1 (measured in its own units)
βₖ = Coefficient for xₖ (units: 1/[xₖ]; dimensionless if xₖ is dimensionless)
xₖ = Predictor Variable k (measured in its own units)

Walkthrough

Derivation

Formula: Logistic Regression (Log-Odds)

ln(p / (1-p)) = β₀ + β₁x₁ + β₂x₂ + ... + βₖxₖ

This formula expresses the log-odds of a binary outcome as a linear combination of predictor variables. It rests on the following assumptions:

  • The dependent variable is binary (dichotomous).
  • Observations are independent.
  • There is a linear relationship between the independent variables and the log-odds of the dependent variable.
  • No multicollinearity among independent variables.
1. Define the Probability:

Let `p` be the probability that the dependent variable `Y` equals 1 (the event of interest). The probability of `Y` equaling 0 is then `1 - p`.

2. Define the Odds:

The odds of an event occurring are the ratio of the probability of the event occurring to the probability of it not occurring: odds = p / (1-p).

3. Introduce the Logit Transformation:

To transform the odds, which range from 0 to infinity, into a quantity that ranges from negative infinity to positive infinity, we take the natural logarithm: logit(p) = ln(p / (1-p)). This is known as the logit transformation.
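As a quick numeric check of steps 2 and 3, here is a minimal Python sketch; the value p = 0.8 is chosen purely for illustration.

```python
import math

p = 0.8                 # probability of the event
odds = p / (1 - p)      # step 2: odds = 4.0 (four times more likely to occur than not)
logit = math.log(odds)  # step 3: log-odds ≈ 1.386
print(odds, logit)
```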

4. Model the Logit as a Linear Function:

The core idea of logistic regression is to model this logit (log-odds) as a linear combination of the independent variables (`x_i`) and their respective coefficients (`β_i`), plus an intercept (`β₀`). This linear predictor can then be transformed back to a probability.

Note: The coefficients `β_i` represent the change in the log-odds for a one-unit increase in `x_i`, holding other variables constant.

Result: ln(p / (1-p)) = β₀ + β₁x₁ + ... + βₖxₖ

Source: Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications. Chapter 19: Logistic Regression.
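To make the result concrete, here is a minimal Python sketch of the model equation and its inverse; the function names and coefficient values are illustrative assumptions, not from the source.

```python
import math

def log_odds(beta0, betas, xs):
    """Linear predictor: ln(p/(1-p)) = β₀ + β₁x₁ + ... + βₖxₖ."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))

def inverse_logit(L):
    """Map log-odds back to a probability: p = 1 / (1 + e^(-L))."""
    return 1.0 / (1.0 + math.exp(-L))

L = log_odds(-1.0, [0.5, 1.2], [2.0, 0.0])  # illustrative coefficients and predictors
print(L, inverse_logit(L))                   # 0.0, 0.5
```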

Free formulas

Rearrangements

Solve for p

Logistic Regression: Make p the subject

To make `p` (Probability of Event) the subject, convert the log-odds back to odds, then solve for `p`: p = 1 / (1 + e^(-(β₀ + β₁x₁ + ... + βₖxₖ))).

Difficulty: 3/5

Solve for β₀

Logistic Regression: Make β₀ the subject

To make `β₀` (Intercept) the subject, subtract the sum of all predictor terms from the log-odds: β₀ = ln(p / (1-p)) - (β₁x₁ + ... + βₖxₖ).

Difficulty: 1/5

Solve for β₁

Logistic Regression: Make β₁ the subject

To make `β₁` (Coefficient for x₁) the subject, isolate the term containing `β₁` and then divide by `x₁`.

Difficulty: 2/5

Solve for x₁

Logistic Regression: Make x₁ the subject

To make `x₁` (Predictor Variable 1) the subject, isolate the term containing `x₁` and then divide by `β₁`.

Difficulty: 2/5

Solve for βₖ

Logistic Regression: Make βₖ the subject

To make `βₖ` (Coefficient for xₖ) the subject, isolate the term containing `βₖ` and then divide by `xₖ`.

Difficulty: 2/5

Solve for xₖ

Logistic Regression: Make xₖ the subject

To make `xₖ` (Predictor Variable k) the subject, isolate the term containing `xₖ` and then divide by `βₖ`.

Difficulty: 2/5
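These rearrangements can be checked symbolically. A minimal sketch with SymPy, using a single-predictor model for brevity (the setup is an illustrative assumption):

```python
import sympy as sp

p, b0, b1, x1 = sp.symbols('p beta0 beta1 x1', real=True)

# Single-predictor model: ln(p/(1-p)) = β₀ + β₁x₁
eq = sp.Eq(sp.log(p / (1 - p)), b0 + b1 * x1)

print(sp.solve(eq, p))   # equivalent to p = 1/(1 + e^(-(β₀ + β₁x₁)))
print(sp.solve(eq, b1))  # β₁ = (ln(p/(1-p)) - β₀) / x₁
```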


Visual intuition

Graph

Graph unavailable for this formula.

The graph of log-odds against x₁ is a straight line whose slope is the coefficient `β₁`, representing a constant rate of change in log-odds as the predictor x₁ increases. For a psychology student, this means that larger values of x₁ consistently shift the predicted log-odds in one direction, indicating a steadily increasing or decreasing likelihood of the psychological event. The key feature is proportionality: every one-unit increase in x₁ changes the log-odds by the same amount, regardless of the starting value.

Graph type: linear
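Since the static page does not render the graph, here is a minimal matplotlib sketch of both views of the model; the coefficients (β₀ = 0, β₁ = 1) are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-6, 6, 200)
log_odds = 0.0 + 1.0 * x              # linear in x (β₀ = 0, β₁ = 1)
p = 1.0 / (1.0 + np.exp(-log_odds))   # sigmoid in x

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(x, log_odds)
ax1.set_title('Log-odds vs x1 (straight line)')
ax2.plot(x, p)
ax2.set_title('Probability vs x1 (S-curve)')
plt.show()
```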

Why it behaves this way

Intuition

Imagine an S-shaped curve (sigmoid function) that maps a linear combination of predictor variables to a probability between 0 and 1, representing the likelihood of a binary outcome.

  • `p` — the probability of the binary outcome being '1' (e.g., success, presence of a condition). A higher value means the event is more likely to occur.
  • `p / (1-p)` — the odds of the event occurring. Quantifies how many times more likely the event is to happen than not to happen.
  • `ln(p / (1-p))` — the log-odds (or logit) of the event occurring. A transformed scale for the odds that ranges from negative to positive infinity, allowing for linear modeling.
  • `β₀` — the intercept term, representing the log-odds of the event when all predictor variables are zero: the baseline log-odds of the outcome before considering the influence of any predictors.
  • `β_i` — the coefficient for predictor variable `x_i`, representing the change in the log-odds of the event for a one-unit increase in `x_i`, holding other predictors constant. It indicates the direction and strength of the influence of `x_i` on the likelihood of the event.
  • `x_i` — an independent (predictor) variable: a measurable factor whose value is used to predict the probability of the outcome.

Signs and relationships

  • `ln(...)`: The natural logarithm maps the odds (which range from 0 to infinity) onto a linear scale (from negative infinity to positive infinity), making the dependent variable suitable for a linear model.
  • `β_i`: A positive `β_i` indicates that as `x_i` increases, the log-odds of the event occurring increase (and thus the probability `p` increases). A negative `β_i` indicates the opposite effect.

Free study cues

Insight

Canonical usage

The equation is used to model the log-odds (logit) of a binary outcome as a linear function of one or more predictors, where the resulting logit is a dimensionless quantity.

Common confusion

Mistaking the regression coefficient (beta) for a direct change in probability (p) rather than a change in the log-odds.

Dimension note

The logit function transforms a probability ratio into a value on the real number line. Because it is the natural logarithm of a ratio of two probabilities (the odds), the left-hand side of the equation is strictly dimensionless.

Unit systems

p — dimensionless · Probability of the event occurring, ranging from 0 to 1.
1-p — dimensionless · Probability of the event not occurring (complementary probability).
β₀ — dimensionless · The intercept, representing the log-odds when all predictor variables are zero.
β_i — 1/[x_i] · The regression coefficient for predictor x_i; its units are the reciprocal of the units of x_i to ensure the product β_i·x_i is dimensionless.

One free problem

Practice Problem

A logistic regression model predicts the likelihood of a student passing an exam. The intercept (β₀) is -2.5. For every hour of study (x₁), the coefficient (β₁) is 0.8. If a student studies for 4 hours, what is the log-odds of them passing the exam?

Intercept (β₀) = -2.5 (dimensionless)
Coefficient for x₁ (β₁) = 0.8 (per hour of study)
Predictor Variable 1 (x₁) = 4 hours

Solve for: ln(p / (1-p)), the log-odds of passing.

Hint: Calculate the linear predictor (right side of the equation).
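A minimal Python sketch that carries out the hinted calculation (the final conversion to a probability goes one step beyond what the problem asks):

```python
import math

beta0, beta1, x1 = -2.5, 0.8, 4     # values from the problem
log_odds = beta0 + beta1 * x1       # -2.5 + 0.8*4 = 0.7
p = 1 / (1 + math.exp(-log_odds))   # ≈ 0.668
print(log_odds, p)
```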


Where it shows up

Real-World Context

Predicting the likelihood of a patient developing a specific mental health condition based on demographic factors and symptom severity scores.

Study smarter

Tips

  • The left side, `ln(p/(1-p))`, is called the log-odds or logit function.
  • The right side, `β₀ + β₁x₁ + ... + βₖxₖ`, is the linear predictor, often denoted as `L` or `η`.
  • To convert log-odds back to probability `p`, use the inverse logit function: `p = 1 / (1 + e^(-L))`.
  • Interpret `β` coefficients as the change in log-odds for a one-unit increase in the corresponding `x` variable, holding others constant.

Avoid these traps

Common Mistakes

  • Interpreting `β` coefficients directly as changes in probability, rather than changes in log-odds.
  • Assuming a linear relationship between predictors and the probability itself, instead of the log-odds.
  • Not checking for multicollinearity among predictor variables, which can inflate standard errors of coefficients.

Common questions

Frequently Asked Questions

What does the formula express?

This formula expresses the log-odds of a binary outcome as a linear combination of predictor variables.

When should I use it?

Apply this formula when you need to model the relationship between a set of independent variables and a dichotomous (binary) dependent variable. It's used to predict the probability of an event occurring, such as predicting whether a student will pass an exam or if a patient will respond to a treatment.

Why does it matter?

Logistic regression is fundamental for understanding and predicting categorical outcomes in many scientific and practical domains. In psychology, it helps researchers identify factors influencing decisions, diagnoses, or behavioral choices. Its ability to provide probabilities makes it invaluable for risk assessment, intervention planning, and developing predictive models for real-world applications.

What mistakes should I avoid?

Interpreting `β` coefficients directly as changes in probability, rather than changes in log-odds. Assuming a linear relationship between predictors and the probability itself, instead of the log-odds. Not checking for multicollinearity among predictor variables, which can inflate standard errors of coefficients.

Where does it show up in practice?

Predicting the likelihood of a patient developing a specific mental health condition based on demographic factors and symptom severity scores.

What are the key things to remember?

The left side, `ln(p/(1-p))`, is called the log-odds or logit function. The right side, `β₀ + β₁x₁ + ... + βₖxₖ`, is the linear predictor, often denoted as `L` or `η`. To convert log-odds back to probability `p`, use the inverse logit function: `p = 1 / (1 + e^(-L))`. Interpret `β` coefficients as the change in log-odds for a one-unit increase in the corresponding `x` variable, holding others constant.

References

Sources

  1. Wikipedia: Logistic regression.
  2. Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression (3rd ed.). Wiley.
  3. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer.
  4. Howell, D. C. Statistical Methods for Psychology.
  5. Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE Publications.