Slack's "Green Dot" Problem: The Nightmare of Real-Time Architecture
Product Decode
The Paradox of the Simplest Feature
In enterprise messaging apps like Slack or Microsoft Teams, no single feature communicates "real-time" more effectively than a simple green dot (Presence State). It tells you a colleague is online and ready to respond. From a UI/UX perspective, it is a rudimentary feature. But from a System Design perspective, when scaling to tens of millions of concurrent users, the green dot becomes a bandwidth and CPU nightmare.
The Trade-off Rule: Perfection in distributed systems is the enemy of scalability. Demanding absolute, millisecond-level accuracy for the online statuses of millions of users will directly crash your infrastructure.
The O(N^2) Problem and the Collapse of Traditional Pub/Sub
The initial architecture of most chat applications relies on the Pub/Sub (Publish/Subscribe) model. When User A comes online, the system publishes a "status: online" event, and all users currently subscribed to User A receive this event via WebSocket.
This approach works flawlessly until you hit the enterprise fan-out problem.
Imagine a workspace with 10,000 employees logging in at 9:00 AM:
When 1 employee logs in, the system pushes 9,999 messages to the rest of the workspace.
When 10,000 employees log in concurrently, the system must process approximately 100,000,000 (100 million) status update events in an instant.
This is the O(N^2) Fan-out effect. A massive amount of CPU and network bandwidth is incinerated just to render green dots that most users will never even see (because they haven't scrolled down their contact list).
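The back-of-the-envelope math behind that claim can be sketched in a few lines. With naive pub/sub, each login fans out to the other N - 1 subscribers, so a morning rush where all N users log in costs roughly N × (N - 1) events:

```python
def naive_fanout_events(n_users: int) -> int:
    """Total status events pushed when every user in a workspace logs in once."""
    # Each of the n_users logins is broadcast to the other (n_users - 1) subscribers.
    return n_users * (n_users - 1)

for n in (100, 1_000, 10_000):
    print(f"{n:>6} users -> {naive_fanout_events(n):>13,} events")
# 10,000 users -> 99,990,000 events, i.e. the ~100 million figure above
```

Note the quadratic growth: a 100× increase in headcount produces a 10,000× increase in presence traffic.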
To solve this, Slack didn't just throw more servers at the problem. They completely shifted their data distribution mindset, prioritizing Efficiency over Volume.
1. Shifting from "Eager" to "Lazy" Loading (View-based Subscription)
The reality is that a user never looks at 10,000 people simultaneously. They only see a maximum of 20-30 people in their current viewport.
Slack shifted from subscribing to the entire workspace to a View-based Subscription model:
The client app only subscribes to the statuses of users currently visible on the screen.
As the user scrolls, the client dynamically unsubscribes from users who leave the viewport and subscribes to newly visible ones.
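The scroll-driven subscribe/unsubscribe logic above amounts to a set diff on each viewport change. Here is a minimal, hypothetical sketch (the `PresenceClient` class and its method names are illustrative, not Slack's actual client API):

```python
class PresenceClient:
    """Tracks which user IDs the client is currently subscribed to."""

    def __init__(self) -> None:
        self.subscribed: set[str] = set()

    def on_viewport_change(self, visible: set[str]) -> tuple[set[str], set[str]]:
        # Diff the newly visible set against the current subscriptions.
        to_subscribe = visible - self.subscribed
        to_unsubscribe = self.subscribed - visible
        # In a real client, each diff would become a WebSocket message,
        # e.g. {"type": "presence_sub", "ids": [...]} (format is assumed).
        self.subscribed = set(visible)
        return to_subscribe, to_unsubscribe

client = PresenceClient()
client.on_viewport_change({"U1", "U2", "U3"})  # subscribes to all three
client.on_viewport_change({"U2", "U3", "U4"})  # subscribes to U4, unsubscribes U1
```

The key property is that traffic is now proportional to viewport size (20-30 users), not workspace size (10,000 users).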
2. Decoupling the Presence Service
Presence state changes constantly (online, idle, typing, offline). If this logic shares resources with the main chat servers, critical chat messages can be starved by the flood of low-priority status events.
Slack fully decoupled this into a Dedicated Presence Service. This service stores the current state in an in-memory cache (like Redis) rather than writing to a traditional Database. This allows for lightning-fast read/write speeds and enables the service to scale independently from the core messaging infrastructure.
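A common pattern for such a service is heartbeat-plus-TTL: clients ping periodically, and an entry that stops being refreshed simply expires, implicitly marking the user offline. In production this would typically be Redis with key expiry (e.g. `SET user:<id> online EX 60`); the sketch below uses an in-memory dict with expiry timestamps as a self-contained stand-in, and all names are illustrative:

```python
import time

class PresenceStore:
    """Stand-in for an in-memory presence cache with per-entry TTL."""

    def __init__(self, ttl_seconds: float = 60.0) -> None:
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[str, float]] = {}  # user_id -> (state, expires_at)

    def heartbeat(self, user_id: str, state: str = "online") -> None:
        # Each client heartbeat refreshes the entry's expiry time.
        self._store[user_id] = (state, time.monotonic() + self.ttl)

    def get(self, user_id: str) -> str:
        entry = self._store.get(user_id)
        if entry is None or entry[1] < time.monotonic():
            return "offline"  # missing or expired entry = offline
        return entry[0]

store = PresenceStore(ttl_seconds=60.0)
store.heartbeat("U1")
store.get("U1")  # "online"
store.get("U9")  # "offline" — never sent a heartbeat
```

The TTL approach also sidesteps the unreliable-disconnect problem: a crashed client never sends an explicit "offline" event, but its entry still expires.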
3. Batching & Debouncing
A human blink takes about 300 milliseconds. Nobody notices if a green dot appears 2 seconds late. Slack leveraged this cognitive gap to implement Batching.
Instead of pushing an event instantly every time a status changes, the Presence Service collects changes into a "bucket" and sends a single, aggregated payload every 2-3 seconds.
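The bucket described above can be sketched as a batcher keyed by user ID. Keying by user ID also gives coalescing for free: rapid flapping (online → idle → online) within one window collapses to a single final state per flush. This is a minimal sketch under assumed names, not Slack's implementation:

```python
class PresenceBatcher:
    """Buffers presence changes and emits one aggregated payload per flush."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}

    def record(self, user_id: str, state: str) -> None:
        # Later updates for the same user overwrite earlier ones (coalescing).
        self._pending[user_id] = state

    def flush(self) -> dict[str, str]:
        # Called on a timer, e.g. every 2-3 seconds, to emit a single payload.
        batch, self._pending = self._pending, {}
        return batch

b = PresenceBatcher()
b.record("U1", "online")
b.record("U1", "idle")    # overwrites U1's earlier "online"
b.record("U2", "online")
print(b.flush())  # {'U1': 'idle', 'U2': 'online'} — one payload, U1's flap coalesced
```

Coalescing matters as much as batching: without it, a flapping client would still inflate every flush with redundant intermediate states.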
| Metric | No Batching (Eager Push) | With Batching (Every 2s) | Impact |
|---|---|---|---|
| Requests/sec | 100,000 req/s | 1,000 req/s | 99% reduction in server load |
| Display Latency | ~50 ms | ~2,000 ms | Completely acceptable UX |
| Mobile Battery | Heavy drain (radio always on) | Low drain (radio sleeps) | Significantly improved battery life |
The Takeaway for Product & System Design
Slack’s "Green Dot" hurdle is a classic illustration that a scale mindset isn't just about writing better code; it's about deeply understanding user behavior.
If you are a Product Manager or TPM dealing with real-time data, ask yourself: "Do we actually need 100% real-time accuracy for every user at every given second?" Accepting a system that is "Eventually Consistent" over one that is "Strongly Consistent" is often the exact boundary between a smooth experience and a Day-1 launch crash.