Sentiment, counted
lexicon
Sentiment over time
VADER compound, averaged per period, by group
Average tone by group
Mean compound · share clearly positive vs negative
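The tone summary can be sketched as a plain aggregation over per-unit compound scores. This is a minimal sketch, not the page's own code: the page does not state which cutoffs count as "clearly" positive or negative, so the ±0.05 values conventionally used with VADER are assumed here, and the `rows` data is hypothetical.

```python
from collections import defaultdict

# Assumption: the conventional VADER cutoffs; the page does not name its own.
POS_CUTOFF, NEG_CUTOFF = 0.05, -0.05

def tone_by_group(rows):
    """rows: iterable of (group, compound) pairs, compound in [-1, 1]."""
    scores = defaultdict(list)
    for group, compound in rows:
        scores[group].append(compound)
    summary = {}
    for group, vals in scores.items():
        n = len(vals)
        summary[group] = {
            "mean_compound": sum(vals) / n,
            "share_positive": sum(v >= POS_CUTOFF for v in vals) / n,
            "share_negative": sum(v <= NEG_CUTOFF for v in vals) / n,
        }
    return summary

# Hypothetical scores, for illustration only.
rows = [("A", 0.6), ("A", -0.2), ("A", 0.0), ("A", 0.4), ("B", -0.5), ("B", 0.1)]
summary = tone_by_group(rows)
```

The two shares need not sum to one: units between the cutoffs count as neutral.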
Distinctive language
weighted log-odds
Fightin' words: terms most over-represented in each group after a Dirichlet prior shrinks rare-word noise. The group is encoded twice: by color, and by which side of the zero line a term sits on.
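A minimal two-group sketch of the z-scored weighted log-odds follows. The prior is assumed to be the pooled term counts times a hypothetical `prior_scale`; the page does not say which prior it uses, and the toy counts are illustrative only.

```python
import math
from collections import Counter

def weighted_log_odds(counts_a, counts_b, prior_scale=1.0):
    """z-scored log-odds of each term in group A versus group B.

    Positive values lean toward A, negative toward B.
    Assumption: the Dirichlet prior is the pooled counts times prior_scale.
    Needs at least two distinct terms, or the odds denominator is zero.
    """
    vocab = set(counts_a) | set(counts_b)
    prior = {w: prior_scale * (counts_a.get(w, 0) + counts_b.get(w, 0)) for w in vocab}
    a0 = sum(prior.values())
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    z = {}
    for w in vocab:
        ya = counts_a.get(w, 0) + prior[w]   # smoothed count in A
        yb = counts_b.get(w, 0) + prior[w]   # smoothed count in B
        delta = math.log(ya / (n_a + a0 - ya)) - math.log(yb / (n_b + a0 - yb))
        variance = 1.0 / ya + 1.0 / yb       # approximate variance of delta
        z[w] = delta / math.sqrt(variance)
    return z

# Hypothetical toy corpora.
counts_a = Counter("tax cut tax jobs the the".split())
counts_b = Counter("health care health jobs the the".split())
z = weighted_log_odds(counts_a, counts_b)
```

Dividing by the standard deviation is what tempers rare terms: a word seen once gets a large variance and so a small z-score.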
Features pulled from the text
feature engineering
Every feature is a deterministic property of the raw text: length, punctuation, capitalization, links, sentiment. These exact columns feed the classifier below, so anything it learns stays auditable.
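As an illustration of what "deterministic property of the raw text" can mean, here is a minimal sketch. The column names and the URL pattern are assumptions rather than the page's actual feature set, and the sentiment column is left out because it needs a lexicon.

```python
import re

# Assumption: a link is anything starting with http:// or https://.
URL_RE = re.compile(r"https?://\S+")

def extract_features(text):
    """Return surface features computed from the text alone (hypothetical column names)."""
    letters = [c for c in text if c.isalpha()]
    return {
        "n_chars": len(text),
        "n_words": len(text.split()),
        "n_exclaim": text.count("!"),
        "n_question": text.count("?"),
        # Share of alphabetic characters that are upper case.
        "share_upper": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
        "n_links": len(URL_RE.findall(text)),
    }

sample = "BIG news! Read more: https://example.com/a"
features = extract_features(sample)
```

The same text always yields the same row, which is what lets a reader trace a coefficient back to something countable in the text.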
Feature fingerprint by group
Group means, each feature scaled to its own range · hover for raw values
When are they posted?
Share of units by hour of day
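The hourly profile reduces to counting units per hour and dividing by the total. A minimal sketch, assuming ISO 8601 timestamps that are already in the time zone of interest (the page does not say which zone it uses); the timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime

def hourly_share(timestamps):
    """Return a 24-element list: share of units posted in each hour, 0 through 23."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    total = sum(hours.values())
    return [hours.get(h, 0) / total for h in range(24)]

# Hypothetical timestamps.
timestamps = [
    "2024-01-05T09:15:00",
    "2024-01-05T09:40:00",
    "2024-01-06T21:05:00",
    "2024-01-07T00:30:00",
]
shares = hourly_share(timestamps)
```

Computing this per group, rather than over all units, gives one curve per group on the same axis.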
Can features tell the groups apart?
logistic regression
Accuracy by representation
Held-out test accuracy vs majority baseline
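The majority baseline is the accuracy earned by always predicting the most common training label. A minimal sketch with hypothetical labels and predictions:

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline(y_train, y_test):
    """Accuracy on the test set of always predicting the majority training label."""
    majority = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [majority] * len(y_test))

# Hypothetical labels and model predictions.
y_train = ["A", "A", "A", "B"]
y_test = ["A", "B", "A", "B", "A"]
y_pred = ["A", "B", "A", "A", "A"]

model_acc = accuracy(y_test, y_pred)
baseline_acc = majority_baseline(y_train, y_test)
```

A model is only interesting to the extent that `model_acc` clears `baseline_acc`; with imbalanced groups the baseline can already be high.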
ROC curve
Confusion matrix
Held-out predictions, headline model
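Both panels can be derived from held-out labels and predicted scores. The sketch below assumes a binary task and a 0.5 decision threshold (neither is stated on the page) and computes the area under the ROC curve from its rank interpretation: the chance that a random positive unit scores above a random negative one.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    cells = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if p == positive:
            cells["tp" if t == positive else "fp"] += 1
        else:
            cells["fn" if t == positive else "tn"] += 1
    return cells

def roc_auc(y_true, scores, positive=1):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical held-out labels and predicted probabilities.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

cells = confusion_matrix(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The confusion matrix depends on the chosen threshold; the AUC does not, which is why the two panels can tell slightly different stories.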
What the model keys on
Largest positive logistic coefficients toward each class
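For a binary logistic regression with one coefficient vector, features pointing toward one class are simply those with the largest coefficients of each sign. A minimal sketch under that assumption; the feature names and coefficient values are hypothetical, not results from this page.

```python
def top_features(coefs, k=2):
    """Split a {feature: coefficient} map into the k strongest pulls toward each class.

    Assumption: binary model; positive coefficients raise the odds of class 1,
    negative coefficients raise the odds of class 0.
    """
    ranked = sorted(coefs.items(), key=lambda kv: kv[1])
    toward_class_0 = [(f, w) for f, w in ranked[:k] if w < 0]
    toward_class_1 = [(f, w) for f, w in reversed(ranked[-k:]) if w > 0]
    return toward_class_0, toward_class_1

# Hypothetical coefficients, for illustration only.
coefs = {
    "n_links": 1.2,
    "n_exclaim": 0.7,
    "share_upper": -0.9,
    "n_words": -0.1,
    "compound": 0.05,
}
toward_0, toward_1 = top_features(coefs)
```

Coefficient sizes are only comparable across features when the inputs share a scale, so standardizing features before fitting matters for this panel.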
Topics, discovered
LDA
Latent Dirichlet Allocation clusters co-occurring words into topics. The model finds structure; a human still has to name it. The instability panel reruns LDA with a second random seed on the same corpus.
Topic prevalence by group
Average topic share within each group · darker = more present
Seed (in)stability
Same data, two seeds · similar themes, different word lists
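One way to put a number on "similar themes, different word lists" is to match each topic from the first seed to its closest topic from the second by overlap of top words. This is a sketch of one possible comparison, not necessarily the one behind the panel; the word lists are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two word lists: size of intersection over size of union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def match_topics(run_a, run_b):
    """For each topic in run_a, return (index_a, best index in run_b, Jaccard score)."""
    matches = []
    for i, words_a in enumerate(run_a):
        j, score = max(
            ((j, jaccard(words_a, words_b)) for j, words_b in enumerate(run_b)),
            key=lambda pair: pair[1],
        )
        matches.append((i, j, score))
    return matches

# Hypothetical top words from two LDA runs with different seeds.
seed_1 = [["tax", "budget", "cut", "spending"], ["health", "care", "insurance", "hospital"]]
seed_2 = [["care", "health", "doctor", "insurance"], ["tax", "budget", "deficit", "cut"]]

matches = match_topics(seed_1, seed_2)
```

Topic indices carry no meaning across runs, so the match is by content. The matching here is greedy and two topics may pick the same partner; a one-to-one assignment would need an extra step.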
LLM measurement
GABRIEL
Construct ratings by group
Mean 0–100 on constructs named before reading any output
Run cost
Dominant frame mix by group
Share of units whose first applicable frame label is each category
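The "first applicable frame" rule can be sketched as a fixed priority order over frame labels. The order, the label names, and the `"none"` fallback for units with no applicable frame are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def dominant_frame_mix(units, frame_order):
    """units: iterable of (group, set of applicable frames).

    A unit's dominant frame is the first label in frame_order that applies to it;
    units with no applicable frame are counted under "none" (an assumption).
    Returns {group: {frame: share of that group's units}}.
    """
    counts = defaultdict(Counter)
    for group, frames in units:
        dominant = next((f for f in frame_order if f in frames), "none")
        counts[group][dominant] += 1
    return {
        group: {frame: n / sum(c.values()) for frame, n in c.items()}
        for group, c in counts.items()
    }

# Hypothetical labels and priority order.
frame_order = ["economic", "moral", "security"]
units = [
    ("A", {"moral", "economic"}),
    ("A", {"security"}),
    ("B", {"moral"}),
    ("B", set()),
]
mix = dominant_frame_mix(units, frame_order)
```

Because only the first match counts, the chosen order shapes the mix: a unit tagged with several frames is credited to whichever comes first.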
Do the methods agree?
triangulation
Per-unit outputs lined up and correlated. Agreement is reassuring, not proof; disagreement is where the assumptions live.
Cross-method correlation
Pearson correlation across per-unit method outputs
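The correlation panel amounts to a Pearson coefficient for every pair of per-unit score columns. A minimal sketch; the method names and values are hypothetical, chosen so that two columns are exact linear transforms of the first.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_matrix(outputs):
    """outputs: {method name: per-unit scores, all aligned to the same units}."""
    names = list(outputs)
    return {a: {b: pearson(outputs[a], outputs[b]) for b in names} for a in names}

# Hypothetical per-unit outputs from three methods, aligned by unit.
outputs = {
    "lexicon_compound": [0.1, 0.5, -0.3, 0.8],
    "llm_positivity": [55, 75, 35, 90],
    "llm_hostility": [45, 25, 65, 10],
}
corr = correlation_matrix(outputs)
```

Pearson only captures linear agreement and ignores scale, so a 0 to 100 rating and a score in [-1, 1] can correlate perfectly while still disagreeing about where "neutral" sits.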