Factor Crowdedness and Spread Compression

A factor becomes "crowded" when enough capital is chasing it that the very act of chasing erodes the edge. The earliest visible symptom is the top-vs-bottom quintile return spread compressing over time. The FM103 Crowdedness sub-pill measures this by regressing each factor's rolling spread on time and reading off the slope and R². A factor with a negative slope and R² ≥ 0.20 is flagged as crowding.

Why crowding matters

A new factor delivers a return spread because few participants are positioned for it. As the factor enters mainstream practice (academic publication, ETF launch, quant strategy proliferation), capital concentrates in the long leg, demand pushes prices up, and the prospective spread shrinks. McLean & Pontiff (2016, JF) documented this for 97 published factors: average post-publication decay of ~58% in the cross-sectional return.

Crowding is the structural complement to factor decay (see Factor Decay). Decay describes how quickly a fresh signal weakens within a holding period; crowding describes how the factor's average spread changes across the backtest period.

The crowdedness measurement

For each factor and each rebalance period t, compute the realised top-minus-bottom quintile spread S_t. Then regress:

S_t = α + β · t + ε_t
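
As a rough illustration of how S_t might be computed for a single period, the sketch below sorts assets by factor score, forms equal-count top and bottom quintiles, and differences their equal-weighted mean returns. The function name `quintile_spread`, the equal weighting, and the convention that a higher score is more attractive are assumptions made for this example, not details of the sub-pill's implementation.

```python
import numpy as np

def quintile_spread(scores, returns):
    """Top-minus-bottom quintile spread for one rebalance period.

    scores  : factor scores at the rebalance date (assumed: higher = better)
    returns : realised returns of the same assets over the following period
    """
    scores = np.asarray(scores, dtype=float)
    returns = np.asarray(returns, dtype=float)
    if scores.ndim != 1 or scores.shape != returns.shape:
        raise ValueError("scores and returns must be 1-D arrays of equal length")

    k = scores.size // 5  # names per quintile (assumed: equal-count buckets)
    if k == 0:
        raise ValueError("need at least 5 assets to form quintiles")

    order = np.argsort(scores, kind="stable")  # ascending by score
    bottom = returns[order[:k]].mean()         # lowest-scored quintile
    top = returns[order[-k:]].mean()           # highest-scored quintile
    return top - bottom
```

Calling this once per rebalance date yields the series S_t that enters the regression.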

The slope β and the R² together describe the crowding profile. The verdict logic:

β negative, R² ≥ 0.20: "Crowding" — the spread is systematically narrowing.
β positive, R² ≥ 0.20: "Widening" — rare; usually a sign the factor is becoming more useful, perhaps due to capital rotation away from it.
R² < 0.20: "Stable" — the spread varies but without a clear trend.
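
The regression and verdict rules above can be sketched in a few lines. This is a minimal illustration using an ordinary least-squares fit of the spread on the period index; the function name `crowding_verdict` and the choice of `numpy.polyfit` are assumptions for the example, and the sub-pill itself may estimate the fit differently.

```python
import numpy as np

def crowding_verdict(spreads, r2_threshold=0.20):
    """Fit S_t = alpha + beta * t and apply the three-way verdict.

    spreads : sequence of per-period top-minus-bottom spreads, in time order
    Returns (beta, r2, verdict).
    """
    s = np.asarray(spreads, dtype=float)
    if s.ndim != 1 or s.size < 3:
        raise ValueError("need a 1-D series with at least 3 periods")

    t = np.arange(s.size, dtype=float)   # period index 0, 1, 2, ...
    beta, alpha = np.polyfit(t, s, 1)    # OLS line: slope first, then intercept

    fitted = alpha + beta * t
    ss_res = np.sum((s - fitted) ** 2)
    ss_tot = np.sum((s - s.mean()) ** 2)
    r2 = 0.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot

    if r2 < r2_threshold:
        verdict = "Stable"
    elif beta < 0:
        verdict = "Crowding"
    else:
        verdict = "Widening"
    return beta, r2, verdict
```

The R² check comes first so that a noisy series is labelled "Stable" whatever the sign of its slope.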

What R² thresholds mean

An R² of 0.20 in a linear time regression with 20–40 periods is moderately strong evidence of a trend: it corresponds to a t-statistic on β of roughly 2.1 at 20 periods and 3.1 at 40. Below 0.20, any estimated slope is small relative to the variance around it, so the trend is washed out by period-to-period noise. Above 0.50, the trend is the dominant story and per-period variance is secondary.
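
For a one-regressor least-squares fit with an intercept, R² and the slope's t-statistic are linked by t² = R²(n − 2) / (1 − R²), where n is the number of periods. The helper below, whose name `t_stat_from_r2` is invented for this sketch, makes it easy to see how the same R² translates into different strength of evidence at different sample lengths.

```python
import math

def t_stat_from_r2(r2, n_periods):
    """Absolute t-statistic on the slope implied by R² in a simple
    linear regression (one regressor plus intercept) with n_periods points."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    if n_periods < 3:
        raise ValueError("need at least 3 periods")
    return math.sqrt(r2 * (n_periods - 2) / (1.0 - r2))

for n in (20, 40):
    print(n, round(t_stat_from_r2(0.20, n), 2))
```

At R² = 0.20 this gives about 2.12 for 20 periods and 3.08 for 40.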

Sector drift as a secondary signal

The Crowdedness sub-pill also tracks sector drift: the fraction of the portfolio's sector composition that changes from period to period. Rising sector drift while the spread is compressing is a particularly bad combination: the factor is rotating into new sectors to chase a diminishing edge. This is the signature of late-stage factor exhaustion.
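
One way to put a number on sector drift, offered here as an assumption rather than as the sub-pill's actual formula, is the one-way turnover of sector weights: half the sum of absolute weight changes between consecutive periods. The function name `sector_drift` and the input layout are hypothetical.

```python
import numpy as np

def sector_drift(weights_by_period):
    """Per-period sector drift as one-way turnover of sector weights.

    weights_by_period : 2-D array-like, rows = periods in time order,
                        columns = sectors, each row summing to 1
    Returns an array with one drift value per consecutive pair of periods:
    0 means an unchanged sector mix, 1 means a complete change.
    """
    w = np.asarray(weights_by_period, dtype=float)
    if w.ndim != 2 or w.shape[0] < 2:
        raise ValueError("need a 2-D array with at least 2 periods")
    # Halve the absolute changes so weight moved from one sector
    # to another is counted once, not twice.
    return 0.5 * np.abs(np.diff(w, axis=0)).sum(axis=1)
```

For example, moving 25% of the long leg from one sector to another between two periods gives a drift of 0.25 for that step.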

Persistence HHI

Crowdedness also reports persistence HHI on the long leg: a Herfindahl index of how often the same names show up in the top quintile across periods. High persistence (HHI > 0.4) means the same handful of names dominate the long leg every period. This is benign for slow-decay value factors (Berkshire is always cheap on some metric) but a red flag for momentum, whose names should rotate.
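
The text does not spell out the exact formula, so the sketch below shows only one plausible reading: count how many periods each name appears in the top quintile, convert the counts to shares of all appearances, and sum the squared shares. The function name `persistence_hhi` is hypothetical. Under this reading the value depends on how many names sit in the quintile, so the 0.4 threshold quoted above may assume a different normalisation.

```python
from collections import Counter

def persistence_hhi(top_quintiles):
    """Herfindahl index over top-quintile appearance shares.

    top_quintiles : iterable of per-period collections of names,
                    e.g. [["AAA", "BBB"], ["AAA", "CCC"], ...]
    Returns a value in (0, 1]; higher means fewer names account
    for more of the long-leg appearances.
    """
    # Count each name at most once per period.
    counts = Counter(name for period in top_quintiles for name in set(period))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no top-quintile members supplied")
    return sum((c / total) ** 2 for c in counts.values())
```

With two names per period, the same pair in every period scores 0.5, while a completely fresh pair each period over two periods scores 0.25.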

How to act on crowdedness signals

If a factor is crowding: Either accept the lower expected spread going forward, blend with a less-crowded factor to dilute the effect, or rotate to a fresher signal.
If sector drift is high: Investigate which sectors are being added or removed. A factor that started as quant value and ended up as quant tech-momentum has changed its nature without you knowing.
If persistence HHI is rising: The factor is becoming less differentiating. Top-quintile selection collapses to the same names regardless of period — you've effectively created a buy-and-hold strategy on a small set of names.

Crowdedness vs. decay vs. regime dependence

Three structural threats to a factor strategy:

Decay — signal weakens within a holding period (intra-period erosion). Fix: faster rebalance.
Crowding — spread declines across the backtest (inter-period erosion). Fix: rotate factors.
Regime dependence — factor only works in some regimes. Fix: regime overlay (see Regime-Conditional).

The three are independent. A factor can be decay-immune (slow half-life) yet crowding-doomed (spread compressing). Always check all three.

Backtests look healthier than reality on the crowding axis

Backtests run on factors that "worked in the past" by construction. The historical record is weighted towards non-crowded periods because the factor wasn't crowded for most of the sample. A factor with a crowding verdict in the backtest is signalling that the late sample is already showing the post-publication decay McLean & Pontiff documented. Live results will likely be closer to the late-sample spread than to the full-period average.
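
To see how far the headline average can sit from the late-sample level, one can compare the full-period mean spread with the fitted value of the trend line at the last period. The sketch below is illustrative only; the function name `full_vs_late_sample` and the use of the end-of-sample fitted value as the "late-sample" figure are assumptions.

```python
import numpy as np

def full_vs_late_sample(spreads):
    """Return (full-period mean spread, trend-line value at the final period)."""
    s = np.asarray(spreads, dtype=float)
    if s.ndim != 1 or s.size < 3:
        raise ValueError("need a 1-D series with at least 3 periods")
    t = np.arange(s.size, dtype=float)
    beta, alpha = np.polyfit(t, s, 1)   # OLS line: slope first, then intercept
    return s.mean(), alpha + beta * t[-1]

# Hypothetical spread that starts at 6% and loses 0.2 points per period.
spreads = [0.06 - 0.002 * i for i in range(20)]
full_mean, end_fit = full_vs_late_sample(spreads)
print(round(full_mean, 3), round(end_fit, 3))
```

In this made-up series the full-period mean is 0.041 while the end-of-sample fitted spread is 0.022, roughly half the headline figure.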
