I scraped 1,576 HN snapshots and found 159 unique stories that hit a score of 100. Then I crawled the actual articles and ran sentiment analysis on them.
The results surprised me.
*The Numbers*
- Negative sentiment: 78 articles (49%)
- Positive sentiment: 45 articles (28%)
- Neutral: 36 articles (23%)
Negative content doesn't just perform well – it dominates.
*What "Negative" Actually Means*
The viral negative posts weren't toxic or mean. They were:
- Exposing problems ("Why I mass-deleted my Chrome extensions")
- Challenging giants ("OpenAI's real business model")
- Honest failures ("I wasted 3 years building the wrong thing")
- Uncomfortable truths ("Your SaaS metrics are lying to you")
The pattern: something is broken, and here's the proof.
*Title Patterns That Worked*
From the 159 viral posts, these structures appeared repeatedly:
1. [Authority] says [Controversial Thing] - 23 posts
2. Why [Common Belief] is Wrong - 19 posts
3. I [Did Thing] and [Unexpected Result] - 31 posts
4. [Company] is [Doing Bad Thing] - 18 posts
Average title length: 8.3 words. The sweet spot is 6-12 words.
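For anyone who wants to reproduce the bucketing, here's a rough sketch of how titles could be sorted into those four structures and how average word count falls out. The regexes and sample titles are my own approximations for illustration, not the rules used in the original analysis.

```python
import re
import statistics

# Rough regex buckets approximating the four title structures above.
# These patterns are illustrative guesses, not the original classifier.
PATTERNS = {
    "[Authority] says [Controversial Thing]": re.compile(r"\b(says|admits|warns|claims)\b", re.I),
    "Why [Common Belief] is Wrong":           re.compile(r"^why\b.*\b(wrong|broken|lying|dead)\b", re.I),
    "I [Did Thing] and [Unexpected Result]":  re.compile(r"^i\s+\w+.*\b(and|but)\b", re.I),
    "[Company] is [Doing Bad Thing]":         re.compile(r"\bis\s+(killing|breaking|ruining|abandoning)\b", re.I),
}

def bucket(title: str) -> str:
    """Return the first matching pattern name, or 'other'."""
    for name, rx in PATTERNS.items():
        if rx.search(title):
            return name
    return "other"

def avg_word_count(titles: list[str]) -> float:
    """Average number of whitespace-separated words per title."""
    return statistics.mean(len(t.split()) for t in titles)

if __name__ == "__main__":
    # Hypothetical sample titles, not drawn from the dataset.
    sample = [
        "Why your SaaS metrics are lying to you",
        "I wasted 3 years building the wrong thing and learned this",
    ]
    for t in sample:
        print(bucket(t), "|", t)
    print("avg words:", round(avg_word_count(sample), 1))
```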
*What Didn't Work*
Almost none of the viral posts were:
- Pure product launches
- "I'm excited to announce..."
- Listicles ("10 ways to...")
- Generic advice
*The Uncomfortable Implication*
If you want reach on HN, you're better off writing about what's broken than what you built.
This isn't cynicism – it's selection pressure. HN readers are skeptics. They've seen every pitch. What cuts through is useful criticism backed by evidence.
*For Founders*
Before your next launch post, ask: what problem am I exposing? What assumption am I challenging? What did I learn the hard way?
That's your hook.
---
Data: I built a tool that snapshots HN/GitHub/Reddit/ProductHunt every 30 minutes. I analyzed 1,576 snapshots, found 2,984 rows with score=100, deduped them to 159 unique URLs, crawled 143 of those successfully, and ran GPT-4 sentiment analysis on the full article text.
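The classification step itself isn't spelled out above, so here is a minimal sketch of how the dedupe-crawl-classify pipeline could look. It assumes snapshot rows of the form {"url", "score"}, plain HTTP fetches for crawling, and one GPT-4 chat completion per article; the prompt wording, model string, and field names are my assumptions, not the actual tool.

```python
import json
import requests
from openai import OpenAI  # assumes the official openai>=1.0 client

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def dedupe_hits(snapshots: list[dict]) -> set[str]:
    """Collect unique article URLs from snapshot rows where score == 100.

    Each row is assumed to look like {"url": ..., "score": ...};
    the real tool's schema may differ.
    """
    return {row["url"] for row in snapshots if row.get("score") == 100}

def fetch_text(url: str) -> str | None:
    """Naive article fetch; a real crawler would extract readable text."""
    try:
        resp = requests.get(url, timeout=15, headers={"User-Agent": "hn-analysis/0.1"})
        resp.raise_for_status()
        return resp.text
    except requests.RequestException:
        return None  # counts as a failed crawl

def classify_sentiment(article_text: str) -> str:
    """Ask GPT-4 for a one-word sentiment label. Prompt is illustrative."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Classify the overall sentiment of the article as "
                        "exactly one word: negative, positive, or neutral."},
            {"role": "user", "content": article_text[:12000]},  # keep the prompt small
        ],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().lower()

if __name__ == "__main__":
    with open("snapshots.json") as f:  # hypothetical dump of snapshot rows
        snapshots = json.load(f)
    labels = {}
    for url in dedupe_hits(snapshots):
        text = fetch_text(url)
        if text:
            labels[url] = classify_sentiment(text)
    print(json.dumps(labels, indent=2))
```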
Happy to share the raw data if anyone wants to dig deeper.
About that data though: just publish it. Throw the data and tooling up on GitHub or Hugging Face if it's a massive dataset. Would be interested in comparing methodologies for deriving sentiment.