The Slope Factory

Story № 2 showed the master gradient: across all of America's census tracts, each percentile up the deprivation index costs about a month of life expectancy. A skeptic has a fair objection ready. National lines can be manufactured by composition: poor Mississippi and rich Minnesota sitting at opposite ends of one chart, with no neighborhood mechanism at all.

The remedy is a habit worth stealing from applied econometrics: many small models instead of one large one. Fit the same regression separately inside every metro, where state policy, climate, health systems, and history are held roughly constant. Then look at the distribution of coefficients. If they scatter around zero, the national line was composition. If they all point the same way, you have found something closer to a law.
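The "many small models" idea can be sketched in a few lines. The block below is a minimal illustration, not the pipeline behind the figures: the function names and the toy tracts are invented for the example, and it uses plain Python where a real analysis would more likely reach for a regression library.

```python
from collections import defaultdict

def wls_slope(x, y, w):
    """Weighted least-squares slope of y on x, with an intercept."""
    sw = sum(w)
    xbar = sum(wi * xi for xi, wi in zip(x, w)) / sw
    ybar = sum(wi * yi for yi, wi in zip(y, w)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for xi, yi, wi in zip(x, y, w))
    sxx = sum(wi * (xi - xbar) ** 2 for xi, wi in zip(x, w))
    return sxy / sxx

def slopes_by_metro(rows):
    """rows: (metro, adi_percentile, life_expectancy, population) tuples.

    Returns one slope per metro, in years per 10 ADI percentiles.
    """
    groups = defaultdict(list)
    for metro, adi, le, pop in rows:
        groups[metro].append((adi, le, pop))
    slopes = {}
    for metro, obs in groups.items():
        x, y, w = zip(*obs)
        slopes[metro] = 10 * wls_slope(x, y, w)
    return slopes

# Toy tracts with made-up values (hypothetical metros "A" and "B").
toy_rows = [
    ("A", 10, 82.0, 4000), ("A", 50, 78.0, 3000), ("A", 90, 74.0, 5000),
    ("B", 20, 80.0, 2000), ("B", 60, 78.0, 2500), ("B", 80, 77.0, 1500),
]
metro_slopes = slopes_by_metro(toy_rows)
for metro, slope in sorted(metro_slopes.items()):
    print(metro, round(slope, 2))
```

The point of the design is that each metro gets its own intercept and its own slope, so differences in level between metros cannot leak into any single coefficient.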

First, the ladder

Before going local, tighten the screws on the national estimate. Raw, the slope is −0.83 years per 10 deprivation percentiles. Compare tracts only within the same state and it steepens to −0.91. Compare only within the same county — the strictest test, where everything from the hospital system to the weather is shared — and it steepens again, to −1.11. The gradient is not a between-region artifact. It is sharpest exactly where the skeptic said it should vanish.
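One way to see what each rung of the ladder does is the within transformation: subtract each group's weighted mean from both variables, then fit the slope on what remains. The sketch below assumes invented toy data and hypothetical function names; the published estimates come from pyfixest, not from this code.

```python
from collections import defaultdict

def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def wls_slope(x, y, w):
    """Weighted least-squares slope of y on x, with an intercept."""
    xbar, ybar = weighted_mean(x, w), weighted_mean(y, w)
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for xi, yi, wi in zip(x, y, w))
    sxx = sum(wi * (xi - xbar) ** 2 for xi, wi in zip(x, w))
    return sxy / sxx

def within_slope(group, x, y, w):
    """Fixed-effects slope: demean x and y by weighted group means, then fit."""
    members = defaultdict(list)
    for i, g in enumerate(group):
        members[g].append(i)
    xd, yd = list(x), list(y)
    for idx in members.values():
        gw = [w[i] for i in idx]
        gx = weighted_mean([x[i] for i in idx], gw)
        gy = weighted_mean([y[i] for i in idx], gw)
        for i in idx:
            xd[i] = x[i] - gx
            yd[i] = y[i] - gy
    return wls_slope(xd, yd, w)

# Toy data (made up): two states with the same internal gradient,
# but the more deprived state has a higher baseline life expectancy.
state = ["S1", "S1", "S1", "S2", "S2", "S2"]
adi = [10, 20, 30, 60, 70, 80]
le = [80.0, 79.0, 78.0, 79.0, 78.0, 77.0]
pop = [1000] * 6

pooled = 10 * wls_slope(adi, le, pop)            # years per 10 percentiles
within = 10 * within_slope(state, adi, le, pop)  # same units, state effects removed
print(round(pooled, 2), round(within, 2))
```

In this toy setup the pooled slope is flatter than the within-state slope because the between-state difference partly offsets the gradient, which is the same direction of change the ladder reports.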

The fixed-effects ladder. Population-weighted regressions of tract life expectancy on ADI percentile; 95% intervals (clustered by state / county). Estimated with pyfixest; ~79,000 tracts.

266 regressions, one sign

Now the factory floor. Each row below is one metro's own regression: its slope and 95% interval, estimated only from its own tracts. The histogram on top shows where the 266 slopes pile up. None of them, not one, lands on the positive side of zero, and 98.9 percent are significantly negative on their own. The deprivation gradient is not a national average that some lucky cities escape. It is the operating physics of every American metro we can measure.
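The two headline tallies, how many slopes are negative and how many are significantly negative on their own, reduce to a simple count over (slope, standard error) pairs. The sketch below assumes a normal critical value of 1.96 for the 95% interval, and the four estimates are invented for illustration.

```python
Z95 = 1.96  # assumed normal critical value for a two-sided 95% interval

def tally_signs(estimates):
    """estimates: list of (slope, standard_error) pairs, one per metro."""
    n = len(estimates)
    negative = sum(1 for b, se in estimates if b < 0)
    # "Significantly negative" here means the whole 95% interval lies below zero.
    sig_negative = sum(1 for b, se in estimates if b + Z95 * se < 0)
    return {
        "share_negative": negative / n,
        "share_sig_negative": sig_negative / n,
    }

# Made-up metro estimates, in years per 10 percentiles.
toy_estimates = [(-1.2, 0.2), (-0.5, 0.3), (-2.4, 0.4), (-0.9, 0.1)]
summary = tally_signs(toy_estimates)
print(summary)
```

A metro can have a negative point estimate without clearing the significance bar, as the second toy metro does, which is why the two shares are reported separately.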

The slope distribution. Each dot: one metro's regression of tract life expectancy on ADI percentile (years per 10 percentiles), weighted by tract population; whiskers are 95% intervals. Metros with ≥40 measurable tracts. Sources: USALEEP; Neighborhood Atlas ADI; ACS.

The spread refuses the easy stories

Generalization established, the interesting question becomes the spread: why do ten deprivation percentiles cost 2.4 years in San Francisco–Oakland and 0.5 in Madison, Wisconsin? Here the second-stage regression delivers a finding by failing. Metro income explains essentially none of it (unweighted r ≈ 0.00, weighted r = 0.07; the chart below is a cloud). Our segregation proxy, how widely deprivation is dispersed across a metro's tracts, explains almost nothing once metros are weighted by the precision of their slopes. Metro size, nothing. The sign of the gradient is a law; its steepness is a stubborn local fact that the obvious metro characteristics do not predict. That null is worth a chapter: whatever sets the local exchange rate between deprivation and death, it is not visible in the census, which points the search at the things censuses miss: health systems, housing stock, drug markets, history.
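Why weighting matters in the second stage can be shown with a weighted correlation. The sketch below weights each metro by the inverse of its squared standard error, as the notes describe for the meta-regression. The five metros are invented, and the function name is a placeholder rather than anything from the project's code.

```python
import math

def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y."""
    sw = sum(w)
    xbar = sum(wi * xi for xi, wi in zip(x, w)) / sw
    ybar = sum(wi * yi for yi, wi in zip(y, w)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for xi, yi, wi in zip(x, y, w))
    sxx = sum(wi * (xi - xbar) ** 2 for xi, wi in zip(x, w))
    syy = sum(wi * (yi - ybar) ** 2 for yi, wi in zip(y, w))
    return sxy / math.sqrt(sxx * syy)

# Made-up second-stage data: four precisely estimated metros with little
# pattern, plus one noisy metro that sits far from the rest.
predictor = [1, 2, 3, 4, 10]               # hypothetical metro characteristic
slope = [-1.0, -1.2, -1.0, -1.2, -3.0]     # years per 10 percentiles
se = [0.1, 0.1, 0.1, 0.1, 1.0]             # standard error of each slope

r_unweighted = weighted_corr(predictor, slope, [1.0] * len(slope))
r_weighted = weighted_corr(predictor, slope, [1 / s ** 2 for s in se])
print(round(r_unweighted, 2), round(r_weighted, 2))
```

In this toy case the unweighted correlation is driven largely by the single imprecise metro, and it shrinks once that metro is down-weighted. That is the kind of instability that argues for reporting a null rather than a story.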

Slope vs. metro median income — the null result. Each dot one metro, sized by population. No fit line is drawn because there is no fit to draw (weighted r = 0.07). Second-stage view of the 266 metro coefficients.

Why this method earns its keep

"Many small models" is more than a robustness check. It converts a modeling assumption — the relationship is the same everywhere — into a measured quantity with its own distribution, its own outliers, and its own second-stage questions. When the 266 coefficients agree on the sign, the finding generalizes and composition stories die. When they disagree on magnitude, the disagreement becomes data. The rest of this series will run several more findings through the same factory; the gradient is simply the first to survive intact.

Notes & data

Specification. Within each CBSA (tract → primary ZCTA → CBSA crosswalk): weighted least squares of USALEEP life expectancy on ADI national percentile, weights = ACS tract population, classical SEs; reported per 10 percentiles. Metros require ≥40 tracts with both measures; 266 qualify, covering ~83% of the measured population.
The ladder uses pyfixest (`le ~ adi | state_fips` / `| county_fips`), population-weighted, cluster-robust SEs at the FE level. The within-county estimate (−1.11) is the book's preferred citation because 69% of tract LE variance lies within counties.
Meta-regression. Metro slope on metro ADI dispersion, log median income, and log population, weighted by inverse squared SE. All three predictors are weak and sign-unstable across specifications (unweighted r with income ≈ 0.00; with ADI dispersion +0.18 unweighted, ≈ 0 weighted). We report the null rather than curve-fit a story; the conditional income coefficient that looked promising (−0.19) does not survive reweighting and is not quoted in the text.
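Caveats. Slopes are descriptive, mixing sorting and causation, and are not treatment effects. ADI percentiles are national, so a metro's internal range varies, which is partly why slopes differ; the meta-regression controls for dispersion to address exactly this. Noise in the tract LE estimates widens the slope intervals, more so in small metros; under classical assumptions it is measurement error in the regressor (ADI), not in the outcome, that would attenuate slopes toward zero.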
Provenance. Estimated by `articles/_build/prep_data_c.py`; every coefficient in the figure is reproducible from the staged lake.
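The per-metro specification in these notes (weighted least squares, classical SEs, slopes reported per 10 percentiles, a minimum of 40 tracts) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the contents of `prep_data_c.py`: the function names are hypothetical, the interval uses a normal critical value of 1.96, and the toy inputs are invented.

```python
import math

MIN_TRACTS = 40  # eligibility threshold stated in the notes

def wls_fit(x, y, w):
    """Weighted least squares of y on x; returns (slope, classical SE)."""
    n = len(x)
    sw = sum(w)
    xbar = sum(wi * xi for xi, wi in zip(x, w)) / sw
    ybar = sum(wi * yi for yi, wi in zip(y, w)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for xi, wi in zip(x, w))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for xi, yi, wi in zip(x, y, w))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # Classical variance: weighted residual sum of squares over (n - 2).
    ssr = sum(wi * (yi - intercept - slope * xi) ** 2 for xi, yi, wi in zip(x, y, w))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope, se

def metro_slope(adi, le, pop, min_tracts=MIN_TRACTS, z=1.96):
    """Slope per 10 ADI percentiles with a 95% interval, or None if too few tracts."""
    if len(adi) < min_tracts:
        return None
    slope, se = wls_fit(adi, le, pop)
    b, s = 10 * slope, 10 * se
    return {"slope": b, "se": s, "ci": (b - z * s, b + z * s)}

# Tiny made-up check of the arithmetic (equal weights, four points).
toy_slope, toy_se = wls_fit([0, 1, 2, 3], [0.0, 1.0, 1.0, 2.0], [1, 1, 1, 1])
print(round(toy_slope, 3), round(toy_se, 3))
```

Passing a smaller `min_tracts` is only useful for toy inputs like the one above; with the default, a metro with fewer than 40 measured tracts returns `None` and drops out of the distribution.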