Regression Splines: An Interactive Introduction
Modelling nonlinear age effects in clinical data
A live in-browser tutorial showing why linear and polynomial fits fail for nonlinear clinical predictors, and why natural cubic splines are usually the right tool. Includes a reactive slider for spline degrees of freedom.
The problem
Many clinical predictors do not relate linearly to the outcome. A textbook example is age: systolic blood pressure rises gently in young adulthood, accelerates around midlife, and may flatten in the very old.
Let’s simulate a dataset that captures this pattern and compare three modelling strategies - all running inside your browser.
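For readers following along outside the browser, here is a minimal NumPy sketch of such a simulation. The logistic shape and every constant in it (110 mmHg baseline, 40 mmHg total rise, midpoint at age 50, noise SD of 8 mmHg, the function name `simulate_sbp`) are illustrative assumptions chosen to mimic the pattern described above; they are not the tutorial's own data-generating code and not clinical estimates.

```python
import numpy as np

def simulate_sbp(n=300, seed=42):
    """Simulate age and systolic blood pressure with an S-shaped age effect.

    Hypothetical helper: all constants are assumptions for illustration only.
    """
    rng = np.random.default_rng(seed)
    age = rng.uniform(20, 90, size=n)
    # Logistic curve: gentle rise when young, steepest near 50, plateau in old age
    mean_sbp = 110 + 40 / (1 + np.exp(-(age - 50) / 8))
    # Add independent Gaussian measurement noise (SD 8 mmHg)
    sbp = mean_sbp + rng.normal(0, 8, size=n)
    return age, sbp, mean_sbp

age, sbp, mean_sbp = simulate_sbp()
print(f"n = {age.size}, age range = {age.min():.0f} to {age.max():.0f} years")
print(f"mean simulated SBP = {sbp.mean():.1f} mmHg")
```

Fixing the seed keeps the three modelling attempts comparable, because each one then sees exactly the same data.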
Attempt 1: a straight line
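The residuals from the straight-line fit are systematically positive in the middle of the age range and negative at the tails - a clear sign of misspecification. Under a well-specified model they would scatter around zero at every age.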
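The residual check can be reproduced offline with a short NumPy sketch: fit the straight line by ordinary least squares, then average the residuals within age bands. The simulated curve, the sample size, and the 10-year bands are assumptions made for this example, so the exact sign pattern it prints depends on that assumed curve and may differ from the tutorial's own plot. What carries over is the diagnostic itself: band means that sit well away from zero indicate structure the line has missed.

```python
import numpy as np

# Assumed data-generating curve (illustrative, not the tutorial's own code)
rng = np.random.default_rng(1)
n = 2000
age = rng.uniform(20, 90, size=n)
sbp = 110 + 40 / (1 + np.exp(-(age - 50) / 8)) + rng.normal(0, 8, size=n)

# Ordinary least squares straight line: sbp = b0 + b1 * age
X = np.column_stack([np.ones_like(age), age])
beta, *_ = np.linalg.lstsq(X, sbp, rcond=None)
resid = sbp - X @ beta

# Mean residual per 10-year age band; a well-specified model keeps these near 0
edges = np.array([20, 30, 40, 50, 60, 70, 80, 90])
band = np.digitize(age, edges[1:-1])  # band index 0..6
band_means = np.array([resid[band == k].mean() for k in range(len(edges) - 1)])

print(f"intercept = {beta[0]:.1f}, slope = {beta[1]:.2f} mmHg per year")
for lo, hi, m in zip(edges[:-1], edges[1:], band_means):
    print(f"ages {lo}-{hi}: mean residual {m:+.2f} mmHg")
```

Because the model includes an intercept, the residuals average to zero overall; the misspecification only becomes visible once they are broken down by age.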
Attempt 2: polynomials (editable)
A common quick fix is to add higher-order terms. Edit the chunk below: change degree from 3 to 7, then 10, then 15. Watch the curve oscillate near the boundaries of the age range - behaviour closely related to Runge’s phenomenon.
The pedagogical point: polynomials are global. Every coefficient is estimated from all of the data, so a single tail observation can yank the entire curve.
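The global behaviour can be checked numerically. The sketch below, which uses assumed simulated data rather than the tutorial's own chunk, fits polynomials of degree 3, 7, 10 and 15, then raises the oldest subject's blood pressure by 30 mmHg and refits. It reports how far the fitted curve moves at that subject's own age and at the opposite end of the age range. A Legendre basis on rescaled ages is used purely for numerical stability; it spans the same polynomials as the raw powers of age, so the fitted curve is unchanged. The helper names `scale`, `poly_fit` and `predict` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

# Assumed data-generating curve (illustrative only)
rng = np.random.default_rng(7)
n = 300
age = np.sort(rng.uniform(20, 90, size=n))  # sorted, so index -1 is the oldest
sbp = 110 + 40 / (1 + np.exp(-(age - 50) / 8)) + rng.normal(0, 8, size=n)

def scale(a):
    """Map ages 20..90 onto -1..1, the natural domain of Legendre polynomials."""
    return (np.asarray(a, dtype=float) - 55.0) / 35.0

def poly_fit(degree, y):
    """Least-squares polynomial of the given degree, in a Legendre basis."""
    B = legendre.legvander(scale(age), degree)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return coef

def predict(coef, a):
    """Evaluate the fitted polynomial at age(s) a."""
    return legendre.legval(scale(a), coef)

# Perturb one tail observation: raise the oldest subject's SBP by 30 mmHg
bumped = sbp.copy()
bumped[-1] += 30.0

results = {}
for degree in (3, 7, 10, 15):
    base, moved = poly_fit(degree, sbp), poly_fit(degree, bumped)
    rss = float(np.sum((sbp - predict(base, age)) ** 2))
    shift_near = float(predict(moved, age[-1]) - predict(base, age[-1]))
    shift_far = float(predict(moved, age[0]) - predict(base, age[0]))
    results[degree] = {"rss": rss, "shift_near": shift_near, "shift_far": shift_far}
    print(f"degree {degree:2d}: RSS {rss:9.1f} | shift at oldest age {shift_near:+6.2f}"
          f" | shift at youngest age {shift_far:+6.2f}")
```

The shift at the perturbed subject's own age equals the 30 mmHg bump multiplied by that observation's leverage, and leverage cannot decrease as terms are added. A nonzero shift at the youngest age, some 70 years away, is the "global" property in numbers.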
Attempt 3: natural cubic splines (reactive)
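Splines are local: piecewise polynomials joined smoothly at knots. A natural cubic spline is additionally constrained to be linear beyond the boundary knots, which restrains the tail behaviour that troubled the polynomial fits. Drag the slider to change the degrees of freedom: more degrees of freedom means more knots and a more flexible curve.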
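As an offline stand-in for the slider, the sketch below builds a natural cubic spline basis by hand using the truncated-power construction and fits it by least squares for several degrees of freedom. Several choices here are assumptions rather than the tutorial's method: the simulated data, the helper names, placing knots at evenly spaced quantiles between the 5th and 95th percentiles of age, and treating `df` as the number of non-intercept basis columns (so `df + 1` knots, and `df = 1` reduces to a straight line). How the tutorial's slider maps to knots is not shown, so that mapping may differ.

```python
import numpy as np

def natural_spline_basis(x, knots):
    """Natural cubic spline basis (truncated-power form), intercept included.

    K knots give K columns. Any fit in this basis is cubic between knots
    and linear below the first knot and above the last knot.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    knots = np.asarray(knots, dtype=float)
    K, last = len(knots), knots[-1]

    def d(k):
        cube = lambda t: np.maximum(t, 0.0) ** 3
        return (cube(x - knots[k]) - cube(x - last)) / (last - knots[k])

    ref = d(K - 2)  # subtracting this cancels the quadratic term past the last knot
    cols = [np.ones_like(x), x] + [d(k) - ref for k in range(K - 2)]
    return np.column_stack(cols)

def truth(a):
    """Assumed true mean curve (illustrative only)."""
    return 110 + 40 / (1 + np.exp(-(a - 50) / 8))

rng = np.random.default_rng(3)
age = rng.uniform(20, 90, size=500)
sbp = truth(age) + rng.normal(0, 8, size=500)
grid = np.linspace(20, 90, 141)

def fit_natural_spline(df):
    """Least-squares fit with df + 1 knots at evenly spaced quantiles of age."""
    knots = np.quantile(age, np.linspace(0.05, 0.95, df + 1))
    coef, *_ = np.linalg.lstsq(natural_spline_basis(age, knots), sbp, rcond=None)
    return knots, coef

results = {}
for df in (1, 2, 4, 6, 8):  # stand-in for slider positions
    knots, coef = fit_natural_spline(df)
    rss = float(np.sum((sbp - natural_spline_basis(age, knots) @ coef) ** 2))
    curve = natural_spline_basis(grid, knots) @ coef
    rmse = float(np.sqrt(np.mean((curve - truth(grid)) ** 2)))
    results[df] = {"knots": knots, "coef": coef, "rss": rss, "rmse": rmse}
    print(f"df {df}: {len(knots)} knots | RSS {rss:9.1f} | RMSE vs assumed truth {rmse:5.2f}")
```

Comparing the fitted curve with the assumed true curve is only possible because the data are simulated; with real clinical data the degrees of freedom would have to be chosen by other means, such as cross-validation or an information criterion.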