Cointegration & Pairs Trading

Two stocks that share a fundamental driver (industry, geography, business model) tend to move together, but their prices drift apart and back over short horizons. Cointegration is the formal statistical concept that says: even though each individual price wanders, a specific linear combination of them is stationary. That stationary combination is the spread you trade.

The Definition

Two non-stationary price series X and Y are cointegrated if there exists a coefficient β such that Y − β × X is stationary. Stationary here means: it has a stable mean, finite variance, and reverts to that mean over time. Either price alone is non-stationary (random walk-like), but the linear combination is.

This is a different property from correlation, and it is the one pairs trading relies on. Two correlated stocks may both trend up together yet drift apart over time; their spread is non-stationary. Two cointegrated stocks have a long-run equilibrium: when their spread deviates, it tends to come back.

Correlation vs Cointegration

Correlation measures whether two series move together day-to-day. Cointegration measures whether they share a long-run equilibrium. Two stocks can be highly correlated and not cointegrated (they trend together but drift apart). Two stocks can be cointegrated and not particularly correlated (they have offsetting short-term moves but a stable long-run relationship). Pairs trading needs cointegration, not just correlation.
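
The distinction can be made concrete with a small simulation. The sketch below is illustrative only: the series are synthetic, and the parameter values (β = 1.5, an AR(1) coefficient of 0.8, a change-correlation of 0.9) are assumptions chosen for the example. Z is a random walk whose daily changes are highly correlated with X's but which has its own permanent shocks; Y equals β × X plus a stationary deviation.

```python
import random
import statistics

def simulate(n=2000, beta=1.5, phi=0.8, rho=0.9, seed=7):
    """Return (x, y, z): a random walk x, a series y cointegrated with x,
    and a random walk z whose changes are merely correlated with x's."""
    rng = random.Random(seed)
    x, z = [100.0], [100.0]
    dev = 0.0                      # stationary AR(1) deviation of y from beta*x
    y = [beta * x[0] + dev]
    for _ in range(n - 1):
        dx = rng.gauss(0, 1)
        # z shares most of each shock with x but keeps its own permanent part
        dz = rho * dx + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        dev = phi * dev + rng.gauss(0, 1)
        x.append(x[-1] + dx)
        z.append(z[-1] + dz)
        y.append(beta * x[-1] + dev)
    return x, y, z

def diff(s):
    """First differences (bar-to-bar changes)."""
    return [b - a for a, b in zip(s, s[1:])]

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

x, y, z = simulate()
spread_y = [yi - 1.5 * xi for xi, yi in zip(x, y)]   # cointegrated spread
spread_z = [zi - xi for xi, zi in zip(x, z)]         # correlated-only spread

print("corr of changes, x vs z:", round(corr(diff(x), diff(z)), 3))
print("corr of changes, x vs y:", round(corr(diff(x), diff(y)), 3))
print("range of y - 1.5x:", round(max(spread_y) - min(spread_y), 2))
print("range of z - x:   ", round(max(spread_z) - min(spread_z), 2))
```

In this setup Z is the more correlated series day to day, yet only Y's spread stays in a bounded band around its mean; the Z − X spread is itself a random walk.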

Testing for Cointegration

Engle-Granger Two-Step

The simplest test:

  1. Regress Y on X to estimate β: Y = α + β × X + ε.
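  2. Test the residuals ε for stationarity using ADF. Because β was estimated in step 1, compare the statistic against Engle-Granger critical values, which are more negative than the standard ADF ones.
  3. If the residuals are stationary (the unit-root null is rejected), the pair is cointegrated.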

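Strengths: simple, fast, gives β directly. Weaknesses: the result depends on which series you choose as the dependent variable (regressing Y on X and X on Y can give different hedge ratios and different test outcomes), and the method can find at most one cointegrating relationship. When there is no natural dependent variable, prefer Johansen.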
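
The two steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not QuanterLab's implementation: it uses a Dickey-Fuller regression with no lagged difference terms (a full ADF adds them), the data are synthetic, and the constant EG_CRIT_5PCT is an approximate asymptotic 5% critical value for two series with an intercept. In practice a library routine such as statsmodels' `coint` handles lag selection and p-values for you.

```python
import math
import random

def ols(y, x):
    """Step 1: fit y = alpha + beta*x by least squares; return (alpha, beta, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    return alpha, beta, resid

def df_tstat(e):
    """Step 2: t-statistic of rho in  de_t = rho * e_{t-1} + u_t  (no lags, no constant)."""
    lag = e[:-1]
    de = [b - a for a, b in zip(e, e[1:])]
    sll = sum(l * l for l in lag)
    rho = sum(l * d for l, d in zip(lag, de)) / sll
    sse = sum((d - rho * l) ** 2 for l, d in zip(lag, de))
    se = math.sqrt(sse / (len(de) - 1) / sll)
    return rho / se

# Assumed approximate 5% critical value for residual-based tests (two series,
# intercept). It is more negative than the ordinary ADF value of about -2.86.
EG_CRIT_5PCT = -3.34

def engle_granger(y, x):
    alpha, beta, resid = ols(y, x)
    t = df_tstat(resid)
    return {"alpha": alpha, "beta": beta, "t_stat": t,
            "cointegrated": t < EG_CRIT_5PCT}

# Synthetic cointegrated pair: y = 2 + 1.5*x + stationary AR(1) deviation.
rng = random.Random(1)
x, dev = [50.0], 0.0
y = [2.0 + 1.5 * x[0]]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0, 1))
    dev = 0.5 * dev + rng.gauss(0, 1)
    y.append(2.0 + 1.5 * x[-1] + dev)

result = engle_granger(y, x)
print(result)
```

Swapping the arguments, `engle_granger(x, y)`, runs the regression the other way round and is a quick way to see how sensitive a given pair is to the choice of dependent variable.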

Johansen Test

Tests cointegration in a multivariate VECM (vector error-correction model) framework. Outputs the number of cointegrating relationships and their coefficients, treating both variables symmetrically. More robust for true pairs, more expensive computationally.
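
For reference, the model behind the test can be written as below. This is the standard textbook form with deterministic terms omitted; the symbols Z, Π, Γ, a, b and the lag order p are notation introduced here for illustration.

```latex
\Delta Z_t = \Pi Z_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \, \Delta Z_{t-i} + \varepsilon_t,
\qquad Z_t = \begin{pmatrix} Y_t \\ X_t \end{pmatrix},
\qquad \Pi = a \, b^{\top}
```

The number of cointegrating relationships is the rank of Π. For a pair, rank 0 means no cointegration; rank 1 means one cointegrating vector b, which after normalization reads (1, −β) and recovers the spread Y − β × X; rank 2 means both series are already stationary.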

QuanterLab's pairs scanner runs Engle-Granger with ADF on residuals as the primary test, with a configurable p-value threshold. Pairs that pass the test become candidates; pairs that fail are filtered out.

The Trade Construction

Once you have a cointegrated pair (X, Y) with hedge ratio β:

  1. Compute the spread: S(t) = Y(t) − β × X(t).
  2. Compute its rolling mean and standard deviation, or fit an Ornstein-Uhlenbeck (OU) process directly to S.
  3. Enter long-spread (long Y, short β × X) when S is far below its mean (e.g., Z-score < −2).
  4. Enter short-spread when S is far above its mean.
  5. Exit when S returns to mean (Z-score crosses 0).

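This is the canonical pairs-trading recipe. Note that the spread as defined holds β shares of X per share of Y, which is dollar-neutral only when β × X is close to Y in price. Many implementations instead size the two legs to equal dollar amounts so that net dollar exposure is roughly zero. That removes most market exposure, though not necessarily all of it, because the two legs can have different market betas.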
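
The entry and exit rules above can be sketched as a small state machine. This is a minimal illustration, assuming a rolling Z-score computed only from the bars before the current one; the function names, the window length, and the sample spread values are hypothetical.

```python
import statistics

def zscores(spread, window):
    """Rolling Z-score using only the `window` bars before each bar (no look-ahead)."""
    out = [None] * len(spread)
    for t in range(window, len(spread)):
        hist = spread[t - window:t]
        mu = statistics.fmean(hist)
        sd = statistics.stdev(hist)
        out[t] = (spread[t] - mu) / sd if sd > 0 else None
    return out

def positions(z, entry=2.0):
    """+1 = long spread (long Y, short beta*X), -1 = short spread, 0 = flat."""
    pos, out = 0, []
    for zt in z:
        if zt is not None:
            if pos == 0:
                if zt < -entry:
                    pos = 1          # spread far below its mean: buy it
                elif zt > entry:
                    pos = -1         # spread far above its mean: sell it
            elif pos == 1 and zt >= 0:
                pos = 0              # Z-score crossed back through 0: exit
            elif pos == -1 and zt <= 0:
                pos = 0
        out.append(pos)
    return out

# Hypothetical spread values for illustration only.
spread = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, -1.5, -0.6, 0.05, 0.2]
z = zscores(spread, window=8)
print(positions(z))
```

Keeping the Z-score window strictly in the past is a design choice: including the current bar in its own mean and standard deviation dampens the signal and leaks information in a backtest.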

The Quality Checks That Matter

A Pair Worth Trading Has...
  • Engle-Granger p-value < 0.05. Statistical evidence of cointegration.
  • Half-life of the spread between 5 and 50 bars. Fast enough to trade, slow enough to be real.
  • Stable β across rolling windows. If β drifts wildly, the relationship is unstable; the pair is fragile.
  • Economic plausibility. Two oil majors, two regional banks, two ETFs tracking similar indices — the cointegration has a story. Two random tickers that test cointegrated by chance are usually false positives.
  • Robustness across sub-periods. Cointegration that holds in 2018–2020 but not 2021–2023 is regime-dependent and dangerous.
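
The half-life check can be estimated from the spread itself. The sketch below assumes the spread follows a discrete AR(1) process, regresses the change in S on its lagged level, and converts the slope b into a half-life of −ln 2 / ln(1 + b); the function name and the sample series are illustrative.

```python
import math

def half_life(spread):
    """Estimate mean-reversion half-life in bars from  dS_t = a + b * S_{t-1} + e_t."""
    lag = spread[:-1]
    d = [s1 - s0 for s0, s1 in zip(spread, spread[1:])]
    n = len(lag)
    ml, md = sum(lag) / n, sum(d) / n
    b = (sum((l - ml) * (di - md) for l, di in zip(lag, d))
         / sum((l - ml) ** 2 for l in lag))
    if b >= 0:
        return math.inf      # no pull toward the mean in this sample
    phi = 1.0 + b            # implied AR(1) coefficient
    if phi <= 0:
        return 0.0           # deviation is gone (or overshoots) within one bar
    return -math.log(2) / math.log(phi)

# Noise-free example: each bar closes 10% of the remaining gap to zero.
example = [10 * 0.9 ** t for t in range(50)]
hl = half_life(example)
print(round(hl, 2), "bars; within 5-50 bar band:", 5 <= hl <= 50)
```

Running the same estimate over several rolling windows gives a rough view of whether the half-life, like β, is stable over time.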

Failure Modes

  • Structural break. A merger, spinoff, or fundamental shift can permanently change the relationship. Pairs that were cointegrated stop being cointegrated and the spread no longer reverts. Painful.
  • Selection bias from multiple testing. Testing many pairs and trading only those that pass cointegration is a multiple-testing problem. If you test 1,000 pairs that have no true relationship, about 50 will still appear cointegrated by chance at p < 0.05. Use a Bonferroni correction or a Deflated Sharpe Ratio (DSR) style adjustment.
  • Liquidity asymmetry. If one leg is much less liquid, the spread you can actually trade differs from the backtested spread.
  • Hedge-ratio drift. A static β estimated on historical data may not be the right β today. Kalman-filter-based pairs trading addresses this directly.
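
The multiple-testing arithmetic is simple enough to sketch directly. The helper names and the pair labels below are hypothetical, and Bonferroni is used only because it is the simplest correction; it tends to be conservative when the tests are correlated, as pair tests within one universe usually are.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of chance 'passes' when no pair is truly cointegrated."""
    return n_tests * alpha

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that keeps the family-wise error rate near alpha."""
    return alpha / n_tests

def survivors(p_values, n_tests, alpha=0.05):
    """Pairs that still pass after correction.

    n_tests must count every pair that was tested, including the ones
    already discarded, not just the pairs present in p_values.
    """
    cut = bonferroni_threshold(n_tests, alpha)
    return {pair: p for pair, p in p_values.items() if p < cut}

print(expected_false_positives(1000))    # about 50 spurious passes
print(bonferroni_threshold(1000))        # corrected cutoff per test

# Hypothetical shortlist from a scan of 1,000 pairs.
shortlist = {"AAA/BBB": 0.00001, "CCC/DDD": 0.03, "EEE/FFF": 0.0004}
print(survivors(shortlist, n_tests=1000))
```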

QuanterLab's Pairs Workflow

Use the Pairs Scanner (within SC001STCB) to filter universes of pairs by cointegration p-value, half-life, β stability, and correlation. Each candidate produces a spread chart, a Z-score chart, and a robustness scan over entry/exit thresholds. Walk-forward the chosen parameters before saving the pair as a tradable strategy.

The Bottom Line

Pairs trading is one of the most academically validated quant strategies, and one of the most p-hacked in retail practice. The math is real; the discipline of avoiding spurious pairs (multiple-testing correction, economic plausibility, walk-forward testing) is what separates trades that work from artifacts of search.

Further Reading

Foundational papers

  • Engle, R. F. & Granger, C. W. J. (1987). Co-integration and Error Correction: Representation, Estimation, and Testing. Econometrica, 55(2), 251–276.
  • Johansen, S. (1988). Statistical Analysis of Cointegration Vectors. Journal of Economic Dynamics and Control, 12(2–3), 231–254.
  • Gatev, E., Goetzmann, W. N. & Rouwenhorst, K. G. (2006). Pairs Trading: Performance of a Relative-Value Arbitrage Rule. Review of Financial Studies, 19(3), 797–827.
  • Avellaneda, M. & Lee, J.-H. (2010). Statistical Arbitrage in the U.S. Equities Market. Quantitative Finance, 10(7), 761–782.

Textbook references

  • Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.
  • Tsay, R. S. (2010). Analysis of Financial Time Series (3rd ed.). Wiley.
  • Chan, E. P. (2013). Algorithmic Trading: Winning Strategies and Their Rationale. Wiley.

Try it in QuanterLab

Use the Pairs Scanner in SC001STCB on an industry universe (e.g., S&P 500 banks). The shortlist of cointegrated pairs is your candidate pool; verify each candidate has economic plausibility before trading — random co-mover pairs that pass cointegration by chance are the most common trap.
