Vishal Singh
The ZIP Code Destiny · Data Story № 12

An Index Autopsy

Half this series leans on one number: the Area Deprivation Index (ADI). So we took it apart and rebuilt a rival from scratch: a PCA of nine census measures with housing costs deliberately left out. The two indices agree about most of America, and disagree by more than 50 percentiles about the Bronx.

The ZIP Code Destiny · PCA on 9 ACS measures × 84,000 tracts · June 2026
51% of variance on the first component
r = 0.68 · ZCD index vs. published ADI
+54 · Bronx divergence, percentiles
≈ tie · horse race predicting life expectancy

Every composite index is a theory wearing a number's clothing. The ADI's theory, inherited from Gopal Singh's 1990s formulation, includes housing values and rents among its seventeen inputs: expensive homes signal advantage. In most of America that is true. In New York, San Francisco, and Los Angeles it breaks catastrophically — a poor family paying $2,400 for a crowded apartment scores as "advantaged."

Principal component analysis lets us rebuild the index with a different theory and almost no hand-tuning: throw nine ACS measures of income, education, employment, and family structure into the machine — no housing costs — and let the data find the single axis that explains the most variation. We call the result the ZCD index. The autopsy compares the two.

01

What the machine chose

The first component absorbs 51 percent of the variance across nine measures, and its loadings need no massaging to interpret: poverty, public assistance, unemployment, low education, and single-parent households pull one way; income, college, homeownership, and labor-force participation pull the other. Disadvantage, the data agree, is one axis.
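As a rough illustration of the mechanics, here is a minimal sketch of a first-component extraction. The analysis itself used scikit-learn's PCA on real ACS data; this sketch swaps in a plain NumPy SVD and synthetic stand-in columns, and the function name `first_component` and the four example variables are assumptions made for illustration only.

```python
import numpy as np

def first_component(X, deprived_col):
    """PCA on standardized columns; returns (scores, loadings, variance_share)."""
    X = np.asarray(X, dtype=float)
    # Standardize each measure (mean 0, SD 1) so no variable dominates by scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized matrix: rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    # Share of total variance carried by the first axis.
    share = s[0] ** 2 / np.sum(s ** 2)
    # PCA signs are arbitrary; flip so a known deprivation measure loads
    # positively, i.e. positive score = more deprived.
    if loadings[deprived_col] < 0:
        loadings = -loadings
    scores = Z @ loadings
    return scores, loadings, share

# Synthetic stand-in data (NOT the ACS measures): one latent axis plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
poverty = latent + rng.normal(scale=0.5, size=500)
unemployment = latent + rng.normal(scale=0.7, size=500)
income = -latent + rng.normal(scale=0.5, size=500)
college = -latent + rng.normal(scale=0.7, size=500)
X = np.column_stack([poverty, unemployment, income, college])

scores, loadings, share = first_component(X, deprived_col=0)
print("loadings:", np.round(loadings, 2))
print("variance share:", round(float(share), 2))
```

The sign flip matters more than it looks: without it, two runs on slightly different data could produce indices that point in opposite directions.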

First-component loadings. PCA of nine standardized ACS tract measures (2019–2023), oriented so positive = more deprived. scikit-learn PCA; 84,000 tracts.
02

Two indices, one disagreement

Plot every tract's ADI percentile against its ZCD percentile. The diagonal mass is the 0.68 correlation — for most of the country the two theories of deprivation agree. The rebellious cloud above the diagonal is dominated by expensive-housing-market tracts (red: ZIP median home value over $450k): places the ZCD calls deprived and the ADI, looking at their rents, calls comfortable.
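The comparison itself is simple to sketch: put both scores on a percentile scale, correlate them, and subtract. The snippet below is a minimal sketch on synthetic scores, not the published data; `percentile_rank` is a hypothetical helper, and in practice the ADI already ships as national percentiles, so only the PCA score would need ranking.

```python
import numpy as np

def percentile_rank(values):
    """Map values to integer percentiles 1..100 by rank (ties broken by order)."""
    values = np.asarray(values, dtype=float)
    ranks = values.argsort().argsort()  # 0 = smallest value
    return np.floor(100 * ranks / len(values)).astype(int) + 1

# Synthetic stand-ins for the two raw deprivation scores.
rng = np.random.default_rng(1)
zcd_score = rng.normal(size=1000)
adi_score = 0.7 * zcd_score + rng.normal(scale=0.7, size=1000)

zcd_pct = percentile_rank(zcd_score)
adi_pct = percentile_rank(adi_score)

# Pearson correlation of the two percentile series.
r = np.corrcoef(adi_pct, zcd_pct)[0, 1]
# Positive divergence: the housing-free index calls the tract more deprived.
divergence = zcd_pct - adi_pct
print("r =", round(float(r), 2))
```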

ADI vs. ZCD percentile, 4,500-tract sample. Red: tracts whose ZIP-area median home value exceeds $450k (expensive housing markets). Sources: Neighborhood Atlas; ACS PCA; ACS home values.
03

The map of the disagreement

Average the divergence by county and the quarrel acquires a geography. The ADI under-calls deprivation (red) in exactly the places this series kept tripping over: the Bronx (+54 percentiles), Brooklyn (+52), urban California. It over-calls deprivation (blue) across cheap-housing metros — upstate New York, Pittsburgh, suburban St. Louis — where low rents read as poverty that the labor market data don't corroborate.
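The county averaging can be sketched in a few lines, assuming the rule given in the map caption: a population-weighted mean of ZCD minus ADI, keeping only counties with at least 8 tracts. The function name, the tuple layout, and the toy rows below are all hypothetical and invented for illustration; they are not figures from the analysis.

```python
from collections import defaultdict

def county_divergence(tracts, min_tracts=8):
    """Population-weighted mean of (ZCD - ADI) percentile per county."""
    # Per county: [weighted sum of divergence, total population, tract count]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for county, pop, zcd_pct, adi_pct in tracts:
        acc = sums[county]
        acc[0] += pop * (zcd_pct - adi_pct)
        acc[1] += pop
        acc[2] += 1
    # Drop counties with too few tracts to average reliably.
    return {
        county: wsum / pop
        for county, (wsum, pop, n) in sums.items()
        if n >= min_tracts and pop > 0
    }

# Hypothetical rows: (county, population, ZCD percentile, ADI percentile)
tracts = (
    [("A", 1000, 90, 40)] * 4
    + [("A", 3000, 80, 20)] * 4
    + [("B", 2000, 30, 50)] * 3
)
result = county_divergence(tracts)
print(result)  # {'A': 57.5}; county B has only 3 tracts and is dropped
```

Weighting by population keeps a handful of sparsely populated tracts from steering a county's colour on the map.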

Legend: blue = ADI overstates deprivation; red = ADI understates deprivation.
ZCD minus ADI percentile, county average (population-weighted; counties with ≥8 tracts). Red = our housing-free index says the county is more deprived than ADI does. Sources: as above.
04

The horse race — and the honest verdict

Which index better predicts life expectancy? Nationally it is essentially a tie (R² = 0.36 vs 0.37). Within counties, the ADI's slope per 10 percentiles is steeper, partly because nationally ranked percentiles compress differently inside metros. The fair conclusion is not that one index wins; it is that the gradient is robust to the theory of deprivation, while the ranking of specific places is not. Use either index for national gradients; use neither uncritically in expensive cities, and never let a single composite carry a city-level claim alone.
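The two quantities in the horse race can be sketched as follows. The analysis used pyfixest with county-clustered standard errors; this sketch swaps in plain NumPy, computes point estimates only (no confidence intervals), and runs on synthetic data with a built-in slope of -0.05 years per percentile. The function names and every number in the simulation are assumptions for illustration.

```python
import numpy as np

def weighted_r2_and_slope(x, y, w):
    """Weighted simple regression of y on x: returns (R^2, slope)."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    xm, ym = np.average(x, weights=w), np.average(y, weights=w)
    sxy = np.sum(w * (x - xm) * (y - ym))
    sxx = np.sum(w * (x - xm) ** 2)
    syy = np.sum(w * (y - ym) ** 2)
    return sxy ** 2 / (sxx * syy), sxy / sxx

def within_county_slope(x, y, w, county):
    """Slope after removing weighted county means (county fixed effects)."""
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    county = np.asarray(county)
    xd, yd = x.copy(), y.copy()
    for c in np.unique(county):
        m = county == c
        xd[m] -= np.average(x[m], weights=w[m])
        yd[m] -= np.average(y[m], weights=w[m])
    return np.sum(w * xd * yd) / np.sum(w * xd ** 2)

# Synthetic tracts: life expectancy falls 0.05 years per deprivation percentile.
rng = np.random.default_rng(2)
n = 2000
county = rng.integers(0, 40, n)
pct = rng.uniform(1, 100, n)
county_effect = rng.normal(scale=1.0, size=40)[county]
life = 82 - 0.05 * pct + county_effect + rng.normal(scale=1.0, size=n)
pop = rng.uniform(500, 5000, n)  # population weights

r2, national_slope = weighted_r2_and_slope(pct, life, pop)
within = within_county_slope(pct, life, pop, county)
print("national R^2:", round(float(r2), 2))
print("within-county slope per 10 percentiles:", round(float(10 * within), 2))
```

Running the same two functions once per index, on the same tracts, gives the side-by-side comparison the chart reports.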

Predicting tract life expectancy. National R² and the within-county slope per 10 percentiles (95% CI), each index in turn. pyfixest, population-weighted, county-clustered SEs.

Notes & data