Not surprised at all. Rage bait has been leading to clicks since before the dawn of AOL.
About that data though, just publish it. Throw the data and tooling up on github or huggingface if it's a massive dataset. Would be interested in comparing methodologies for deriving sentiment.
Here's the raw dataset: https://asof.app/static/hn_viral_dataset.json
159 stories that hit score 100 in my tracking, with HN points, comments, and first-seen timestamp.
Methodology:
- Snapshots every 30 minutes (1,576 total)
- Filtered to score=100 (my tracking cap)
- Deduped by URL, kept first occurrence
- Date range: Dec 2025 - Jan 2026
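If anyone wants to reproduce the filter/dedupe step against the published JSON, roughly this sketch should do it (field names are simplified here; check the file for the exact schema):

```python
import json
import urllib.request

# Fetch the published dataset (field names below are assumptions).
URL = "https://asof.app/static/hn_viral_dataset.json"
with urllib.request.urlopen(URL) as resp:
    stories = json.load(resp)

# Keep stories that reached the score-100 tracking cap,
# then dedupe by URL, keeping the first-seen occurrence.
seen = set()
deduped = []
for story in sorted(stories, key=lambda s: s.get("first_seen", "")):
    if story.get("points", 0) < 100:
        continue
    url = story.get("url")
    if url in seen:
        continue
    seen.add(url)
    deduped.append(story)

print(f"{len(deduped)} unique stories at score >= 100")
```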
For sentiment, I ran GPT-4 on the full article text with a simple positive/negative/neutral classification. Not perfect but consistent enough to see the 2:1 pattern.
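The classification pass was roughly this shape (a simplified sketch, not the exact script; the prompt wording here is illustrative):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify_sentiment(article_text: str) -> str:
    """Classify full article text as positive, negative, or neutral."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": (
                    "Classify the overall sentiment of the following article. "
                    "Reply with exactly one word: positive, negative, or neutral."
                ),
            },
            # Truncate very long articles to keep within context limits.
            {"role": "user", "content": article_text[:12000]},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()
```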
Thought about this this morning. I'll run the posts through ministral3:3b, mistral-small3.2:24b, and gpt-oss:20b this weekend to build a sentiment mapping and see what I get. I'm optimistic about ministral3:3b, but the other two are pretty good at this type of stuff.
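For the local runs I'm thinking of a loop along these lines, assuming the models are pulled into Ollama and the article text has already been fetched into a `text` field (both assumptions on my part):

```python
import ollama

MODELS = ["ministral3:3b", "mistral-small3.2:24b", "gpt-oss:20b"]
PROMPT = (
    "Classify the overall sentiment of the following article. "
    "Reply with exactly one word: positive, negative, or neutral.\n\n"
)

def classify(model: str, article_text: str) -> str:
    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": PROMPT + article_text[:12000]}],
    )
    return response["message"]["content"].strip().lower()

def sentiment_mapping(stories: list[dict]) -> dict[str, list[str]]:
    """Per-model list of sentiment labels, for comparing the three models."""
    return {
        model: [classify(model, s.get("text", "")) for s in stories]
        for model in MODELS
    }
```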
Interesting, would love to see the results. I'll be checking back here if you care to share them.
Yeah, I plan to do a follow-up comment with data and results.
It’s unfortunate that you came up with these stats and then didn’t use that info to tune the title for this post.
Lol fair. I chickened out on the rage-bait title because it felt too meta.
If this dies in /new, at least I proved my own point.
Related:
65% of Hacker News posts have negative sentiment, and they outperform
https://news.ycombinator.com/item?id=46512881
If people don't like something, I think the motivation to act is stronger than if they like it or are neutral. Human nature.
People will downvote a headline that speaks positively about something they don't like.
But what do they do with a negative headline about something they don't like? I guess they will upvote it to show they also don't like it.
So negative wins.
"ran GPT-4 sentiment analysis on full article text." I think most people vote based on headlines, not on article text.
You're right that most voting is headline-driven - that's definitely a limitation worth calling out.
I went with full article text because I wanted to capture what the content actually delivers, not just what the headline promises. A clickbait negative headline with a balanced article would skew results if I only looked at titles.
That said, you've got me thinking. It would be interesting to run sentiment on headlines separately and compare. If headline sentiment correlates strongly with article sentiment, your point stands. If they diverge, there might be something interesting about the gap between promise and delivery.
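The comparison itself would be simple enough, roughly this (assuming per-story headline and article labels are already in hand; nothing here is from the existing pipeline):

```python
from collections import Counter

def compare(headline_labels: list[str], article_labels: list[str]) -> None:
    """Compare per-story headline vs. article sentiment labels."""
    pairs = list(zip(headline_labels, article_labels))
    agreement = sum(h == a for h, a in pairs) / len(pairs)
    print(f"Agreement: {agreement:.0%}")
    # Where they diverge: what the headline promises vs. what the article delivers.
    divergent = Counter((h, a) for h, a in pairs if h != a)
    for (headline, article), n in divergent.most_common():
        print(f"headline={headline:8s} article={article:8s} count={n}")
```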
Might be a good follow-up analysis. Thanks for pushing on this.