Vishal Singh

Part 4 case study

NYC Metro ZIP Health Segments

A factor-analysis and clustering case using ZIP-level health prevalence measures. Demographics are held out of the factor model, then used to interpret what the factors mean.

The two-factor story is strong.

PCA on the health variables shows that the first two principal components carry most of the variation. A rotated factor model turns that compression into two readable axes: overall health burden and older-age chronic conditions.
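To make the PCA-then-rotation step concrete, here is a minimal sketch on synthetic data. It is not the case study's own code or data: the six placeholder health variables, the two simulated latent factors, the noise level, and the helper names `pca_correlation` and `varimax` are all assumptions for illustration. It also uses principal-component loadings followed by a varimax rotation, which is one common way to get rotated factors and may differ from the estimator used for the charts here.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix: eigenvalues (descending) and eigenvectors."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ (rotated ** 3 - rotated @ col_ss / p))
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation

# Synthetic stand-in for ZIP-level health prevalence measures (hypothetical):
# three variables driven by a "burden" factor, three by an "age/chronic" factor.
rng = np.random.default_rng(0)
n_zips = 200
latent = rng.normal(size=(n_zips, 2))
weights = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
X = latent @ weights.T + 0.3 * rng.normal(size=(n_zips, 6))

eigvals, eigvecs = pca_correlation(X)
explained = eigvals / eigvals.sum()
two_factor_share = explained[:2].sum()  # share of variance in the first two components

# Unrotated loadings for two components, then rotate toward simple structure.
unrotated = eigvecs[:, :2] * np.sqrt(eigvals[:2])
rotated = varimax(unrotated)

print("share explained by first two components:", round(float(two_factor_share), 3))
print("rotated loadings:\n", np.round(rotated, 2))
```

Because the rotation is orthogonal, it leaves the variance explained by the two factors together unchanged; it only redistributes loadings so that each variable lines up mainly with one axis, which is what makes the axes nameable.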

The interpretation comes after scoring.

Income, college share, deprivation, and age are not used to build the factors. Their correlations with the factor scores explain why the map reads as socioeconomic burden crossed with an age/chronic-care dimension.
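The "interpret after scoring" step amounts to a cross-correlation table between held-out demographics and the factor scores. The sketch below uses simulated scores and simulated demographics; the column names, the signs, and the strengths of the relationships are assumptions chosen only to mimic the pattern described above, not results from the case study. `cross_correlation` is a hypothetical helper name.

```python
import numpy as np

def cross_correlation(scores, demographics):
    """Pearson correlations: rows are demographic variables, columns are factors."""
    s = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    d = (demographics - demographics.mean(axis=0)) / demographics.std(axis=0, ddof=1)
    return d.T @ s / (len(s) - 1)

rng = np.random.default_rng(1)
n_zips = 200

# Hypothetical factor scores, one row per ZIP: [health burden, age/chronic].
scores = rng.normal(size=(n_zips, 2))
burden, age_chronic = scores[:, 0], scores[:, 1]

# Simulated held-out demographics (assumed relationships, for illustration only).
noise = rng.normal(size=(n_zips, 4))
demographics = np.column_stack([
    -0.8 * burden + 0.6 * noise[:, 0],      # income
    -0.7 * burden + 0.7 * noise[:, 1],      # college share
    0.8 * burden + 0.6 * noise[:, 2],       # deprivation
    0.8 * age_chronic + 0.6 * noise[:, 3],  # median age
])
demo_names = ["income", "college_share", "deprivation", "median_age"]

readout = cross_correlation(scores, demographics)
for name, row in zip(demo_names, readout):
    print(f"{name:>14}  burden={row[0]:+.2f}  age_chronic={row[1]:+.2f}")
```

The demographics never enter the factor model in this setup; they are only correlated with the finished scores, so any pattern in the table is a reading of the factors rather than something built into them.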

Factor Score Map

x = age/chronic factor, y = health burden factor

Segment Profile

cluster means

ZIP Coordinate View

centroids, no external map tiles

PCA and Factor Loadings

health variables only

Demographic Readout

correlation after scoring