WhatsApp for on-call: why engineers prefer it over PagerDuty at 2am
This isn't an argument against PagerDuty. PagerDuty is a well-built product that solves real problems. It's an argument about fit: most small engineering teams buy more incident management tooling than they actually need.
The friction argument
At 2am, friction kills response time. Every extra step between 'phone buzzes' and 'engineer understands the situation' adds directly to MTTR.
The PagerDuty mobile flow: phone buzzes → unlock → open PagerDuty app → find the incident → read the title → tap to see details → read the raw metric → open the AWS console → investigate.
The WhatsApp flow (with a diagnostic message): phone buzzes → unlock → read root cause and suggested fix already in the message → decide.
WhatsApp wins on friction not because it's a better alerting tool (it isn't) but because most engineers already have it installed and check it constantly. There's no context switch to a work-only app. The message arrives where you already communicate.
The cost argument
PagerDuty's entry-level paid plans run roughly $20–25 per user per month. For a 3-person on-call team, that's $60–75/month or $720–900/year.
WhatsApp Business API pricing is conversation-based — Meta charges per 24-hour messaging window, with rates varying by country and conversation type (utility vs marketing). For a handful of incident notifications per month, the cost is typically under $5/month.
The cost gap is real, but it's not the main reason to use WhatsApp. The main reason is reduced friction for the teams that don't need the extra features PagerDuty provides.
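The arithmetic in this section can be checked with a few lines of Python. This is a minimal sketch using only the rough figures quoted above ($20–25 per user per month for PagerDuty, a ceiling of about $5/month for WhatsApp); the constant and function names are made up for illustration, and real pricing will vary by plan and country.

```python
from typing import Dict, Tuple

# Rough figures taken from the text above; treat them as estimates, not quotes.
PAGERDUTY_PER_USER_MONTHLY: Tuple[float, float] = (20.0, 25.0)  # USD, low/high
WHATSAPP_MONTHLY_CEILING: float = 5.0  # USD, "typically under $5/month"


def cost_summary(team_size: int) -> Dict[str, object]:
    """Compare PagerDuty seat cost with the WhatsApp ceiling for one team."""
    low, high = PAGERDUTY_PER_USER_MONTHLY
    monthly = (team_size * low, team_size * high)
    return {
        "pagerduty_monthly": monthly,
        "pagerduty_annual": (monthly[0] * 12, monthly[1] * 12),
        "whatsapp_annual_ceiling": WHATSAPP_MONTHLY_CEILING * 12,
        # How many times cheaper WhatsApp is if it costs the full ceiling.
        "ratio_at_ceiling": (
            monthly[0] / WHATSAPP_MONTHLY_CEILING,
            monthly[1] / WHATSAPP_MONTHLY_CEILING,
        ),
    }


if __name__ == "__main__":
    for key, value in cost_summary(3).items():
        print(f"{key}: {value}")
```

For a 3-person team this gives $60–75/month against a $5 ceiling, so the gap is at least 12× and grows as WhatsApp spend drops below the ceiling.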
Feature comparison
| Feature | PagerDuty | WhatsApp + diagnosis |
|---|---|---|
| On-call scheduling | ✓ Built-in with calendar UI | ✗ Manual — track in a shared calendar |
| Escalation policies | ✓ Automatic (page B if A doesn't ack in 15 min) | ✗ Manual — call the next person yourself |
| Acknowledgment tracking | ✓ Logged with timestamps | ✗ Not available |
| Phone call fallback | ✓ Calls you as a last resort | ✗ Not available |
| Alert routing by severity | ✓ Built-in rule engine | ✗ Manual triage |
| Post-incident reporting | ✓ MTTR, ack-time, and responder analytics | ✗ Not available |
| SLA audit trail | ✓ Yes | ✗ No |
| Incident context / diagnosis | ✗ Raw metric only (unless you enrich it) | ✓ Root cause + suggested fix in the message |
| Friction to read the alert | Medium — requires opening the PagerDuty app | Low — message in the app you already have open |
| Cost (3-person on-call team) | ~$60–75/month | ~$5/month via WhatsApp Business API |
If your team has formal SLAs with customers, on-call rotations that need to be managed, or a requirement for an audit trail of who acknowledged what and when, PagerDuty earns its cost.
Which tool fits which team
| Team profile | Recommended approach |
|---|---|
| 2–5 engineers, no formal SLAs | WhatsApp with diagnostic messages — lower friction, much lower cost |
| 5–15 engineers, formal rotations needed | PagerDuty for paging and escalation + WhatsApp/Slack for pre-diagnosed context |
| 15+ engineers, SLA commitments | PagerDuty or OpsGenie — audit trail and reporting justify the cost |
The hybrid approach
Some teams run both: PagerDuty for the paging (reliable, with phone-call fallback if the engineer doesn't respond), and WhatsApp or Slack for the actual diagnostic information. The page wakes you up; the WhatsApp message tells you what's wrong.
This adds some complexity but captures the best of both: the escalation reliability of PagerDuty and the context richness of a pre-diagnosed message.
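One way to wire the hybrid pattern is to fan a single alarm out into two messages that share a key: a bare page sent to PagerDuty's Events API v2, and a context message for the chat channel. The sketch below only builds the two messages and sends nothing. The `fan_out` and `build_page` helpers, the dedup key format, and the fixed `critical` severity are illustrative assumptions, not a prescribed setup.

```python
import json
import urllib.request
from typing import Tuple

PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_page(routing_key: str, alarm_name: str, source: str,
               dedup_key: str) -> urllib.request.Request:
    """Build (but do not send) an Events API v2 trigger for the page."""
    body = {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": dedup_key,
        "payload": {
            "summary": f"{alarm_name} is in ALARM",
            "source": source,
            "severity": "critical",  # assumption: every alarm pages as critical
        },
    }
    return urllib.request.Request(
        PAGERDUTY_EVENTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fan_out(alarm_name: str, source: str, diagnosis: str,
            routing_key: str) -> Tuple[urllib.request.Request, str]:
    """Return the page request and the chat text, linked by one key."""
    dedup_key = f"{source}:{alarm_name}"  # hypothetical key format
    page = build_page(routing_key, alarm_name, source, dedup_key)
    # The chat message carries the diagnosis; the key lets the responder
    # match it to the page that woke them up.
    context_message = f"[{dedup_key}] {diagnosis}"
    return page, context_message
```

Sending the page is a matter of passing the request to `urllib.request.urlopen`; the context message goes to whichever WhatsApp or Slack sender you already have.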
How AWS CloudWatch connects to WhatsApp
The path from a CloudWatch alarm to a WhatsApp message goes through three hops:
- CloudWatch Alarm → SNS Topic (built-in AWS integration)
- SNS Topic → Lambda function (subscribed to the topic)
- Lambda function → WhatsApp Business API (HTTP call to Meta's Cloud API)
The Lambda function receives the raw SNS payload (the alarm notification), optionally queries CloudWatch Logs Insights and CloudWatch metrics for context, formats a diagnostic message, and sends it via the WhatsApp Business API to a specific phone number.
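A minimal sketch of that Lambda in Python, standard library only, might look like this. It assumes the alert goes out as a pre-approved template message, since business-initiated WhatsApp messages generally need one. The template name `incident_alert`, its three body parameters, the environment variable names, the pinned Graph API version, and the 500-character cap are all placeholders to adapt.

```python
import json
import os
import urllib.request
from typing import Dict

GRAPH_API_VERSION = "v19.0"  # assumption: pin to the version your app uses


def parse_sns_alarm(event: dict) -> Dict[str, str]:
    """Pull the CloudWatch alarm fields out of an SNS-triggered event."""
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    return {
        "name": message["AlarmName"],
        "state": message["NewStateValue"],
        # Collapse whitespace so the value stays on a single line.
        "reason": " ".join(message["NewStateReason"].split()),
        "metric": message.get("Trigger", {}).get("MetricName", "unknown"),
    }


def build_whatsapp_request(alarm: Dict[str, str], phone_number_id: str,
                           token: str, to: str) -> urllib.request.Request:
    """Build (but do not send) a Cloud API template message request."""
    body = {
        "messaging_product": "whatsapp",
        "to": to,
        "type": "template",
        "template": {
            "name": "incident_alert",  # hypothetical template name
            "language": {"code": "en_US"},
            "components": [{
                "type": "body",
                "parameters": [
                    {"type": "text", "text": alarm["name"]},
                    {"type": "text", "text": alarm["metric"]},
                    {"type": "text", "text": alarm["reason"][:500]},  # arbitrary cap
                ],
            }],
        },
    }
    url = (f"https://graph.facebook.com/{GRAPH_API_VERSION}"
           f"/{phone_number_id}/messages")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def handler(event, context):
    """Lambda entry point; the environment variable names are placeholders."""
    alarm = parse_sns_alarm(event)
    request = build_whatsapp_request(
        alarm,
        os.environ["WA_PHONE_NUMBER_ID"],
        os.environ["WA_TOKEN"],
        os.environ["ONCALL_NUMBER"],
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return {"status": response.status}
```

Keeping the parsing and request-building in pure functions means both can be unit-tested without touching AWS or Meta; only `handler` makes a network call.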
The value is entirely in the enrichment step — the Lambda querying logs and metrics before sending the message. Skip the enrichment and you've just forwarded a raw alarm to a different channel, which doesn't improve anything.
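As a rough illustration of the enrichment step, the sketch below runs a CloudWatch Logs Insights query for the most frequent error lines in the alarm window and folds them into the message text. "Most frequent error" is a deliberately simple stand-in for real diagnosis logic. The function names, the query, and the message layout are assumptions; the client is passed in, so in Lambda you would supply `boto3.client("logs")`.

```python
import time
from typing import List, Tuple


def top_error_messages(logs_client, log_group: str, start: int, end: int,
                       limit: int = 3, poll_seconds: float = 1.0,
                       max_polls: int = 20) -> List[Tuple[str, int]]:
    """Return (message, count) pairs for the most frequent ERROR lines.

    start and end are epoch seconds, as the Logs Insights API expects.
    """
    query = (
        "fields @message "
        "| filter @message like /ERROR/ "
        "| stats count(*) as hits by @message "
        "| sort hits desc "
        f"| limit {limit}"
    )
    query_id = logs_client.start_query(
        logGroupName=log_group, startTime=start, endTime=end,
        queryString=query,
    )["queryId"]
    # Logs Insights queries are asynchronous, so poll until one finishes.
    for _ in range(max_polls):
        result = logs_client.get_query_results(queryId=query_id)
        if result["status"] == "Complete":
            rows = []
            for row in result["results"]:
                fields = {cell["field"]: cell["value"] for cell in row}
                rows.append((fields.get("@message", ""),
                             int(fields.get("hits", 0))))
            return rows
        if result["status"] in ("Failed", "Cancelled", "Timeout"):
            break
        time.sleep(poll_seconds)
    return []  # fall back to an unenriched alert rather than failing


def format_diagnostic(alarm_name: str, reason: str,
                      errors: List[Tuple[str, int]]) -> str:
    """Turn the alarm plus log findings into a short, readable message."""
    lines = [f"ALARM: {alarm_name}", f"Trigger: {reason}"]
    if errors:
        lines.append("Most frequent errors in the alarm window:")
        lines += [f"- {message[:120]} ({hits}x)" for message, hits in errors]
    else:
        lines.append("No matching error lines found in the alarm window.")
    return "\n".join(lines)
```

Returning an empty list on failure or timeout is a design choice: a late or missing diagnosis should degrade to a plain alert, never block the notification.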
Frequently asked questions
Can I use WhatsApp instead of PagerDuty for on-call alerting?
Yes, for teams of 2–5 engineers with no formal SLA commitments. WhatsApp with diagnostic messages has lower friction and costs ~$5/month via WhatsApp Business API vs ~$60–75/month for PagerDuty. The tradeoff is no escalation policies, no acknowledgment tracking, and no phone-call fallback. The approach works when the message contains a pre-computed root cause, not just a raw metric value.
What does WhatsApp Business API cost for on-call incident alerts?
WhatsApp Business API pricing is conversation-based: Meta charges per 24-hour messaging window. For a typical startup with a handful of incident notifications per month, the cost is usually under $5/month. This compares to PagerDuty's ~$20–25 per user per month, making WhatsApp roughly 12–15× cheaper for a 3-person on-call team ($60–75/month versus under $5/month).
How does a CloudWatch alarm connect to WhatsApp?
The path is three hops: CloudWatch Alarm → SNS Topic (built-in AWS integration), SNS Topic → Lambda function (subscribed to the topic), Lambda function → WhatsApp Business API (HTTP call to Meta's Cloud API). The Lambda optionally queries CloudWatch Logs Insights and metrics before sending, producing a diagnostic message rather than a raw alert. The value is entirely in this enrichment step.
What is the hybrid on-call approach and when should teams use it?
The hybrid approach uses PagerDuty for reliable paging (phone-call fallback if the primary doesn't respond) and WhatsApp or Slack for the diagnostic content. The page wakes you up; the WhatsApp message tells you what's wrong. Teams with formal on-call rotations that need escalation reliability but also want rich incident context use this pattern.
When does PagerDuty justify its cost over WhatsApp?
PagerDuty earns its cost when you have: formal SLA commitments requiring an audit trail, on-call rotations with more than 5 engineers needing automated escalation, customers who can contractually penalise you for slow response, or a need for post-incident MTTR analytics. For teams with none of these, WhatsApp is the pragmatic choice.