
The real cost of a 1-hour AWS outage for a 10-person startup

April 7, 2026 · 6 min read

Most founders calculate the cost of an outage by dividing monthly revenue by the hours in a month. That underestimates the real number by 3–5x. Here's the full picture.

The five cost categories

  1. Direct revenue loss (immediate)
  2. Engineering labor — investigation, fix, post-mortem (immediate)
  3. Support overhead — tickets, proactive communication (immediate + short-term)
  4. Customer trust — churn risk, renewal uncertainty (medium-term)
  5. Opportunity cost — what didn't get built while engineers were firefighting (medium-term)

1. Direct revenue loss

Revenue loss = (Monthly revenue ÷ hours in month) × outage hours × affected user %

Hours in a month ≈ 730
Example: $20k MRR ÷ 730 = $27.40/hour × 1 hour × 100% users = $27.40

For most startups, direct revenue loss from a 1-hour outage is surprisingly small — $20–100 for a $10k–$50k MRR business. This is why founders feel like outages 'aren't that expensive.' They're looking at only one line of the cost model.

The number changes significantly if the outage hits a high-conversion window: a flash sale, the last day of a customer trial, a product launch with press coverage, or end of the month when B2B customers are using your product to close their own books.
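The formula above can be written as a small helper. This is an illustrative sketch using the $20k MRR example; the inputs are the article's assumptions, not real figures:

```python
HOURS_PER_MONTH = 730  # average hours in a month

def direct_revenue_loss(mrr: float, outage_hours: float, affected_fraction: float) -> float:
    """Revenue lost during an outage, pro-rating MRR by the hour."""
    hourly_revenue = mrr / HOURS_PER_MONTH
    return hourly_revenue * outage_hours * affected_fraction

# $20k MRR, 1-hour outage, 100% of users affected
print(f"${direct_revenue_loss(20_000, 1, 1.0):.2f}")  # → $27.40
```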

2. Engineering labor — usually the largest immediate cost

Engineering cost = engineers involved × hourly loaded cost × hours spent

Loaded cost ≈ 2× base salary (includes benefits, equity, overhead)
$120k salary → ~$60/hr base → ~$120/hr loaded cost to the company

Example:
2 engineers × $120/hr × 3 hours (detection + fix + post-mortem) = $720

A 1-hour outage rarely costs 1 hour of engineering time. The realistic breakdown: 15–30 min to detect and understand, 30–60 min to fix, 15 min to verify and communicate, 1–2 hours for the post-mortem the next day. That's 3–5 hours of engineering time across the team per incident.

For a startup where 3 engineers are involved: 3 × $100/hr loaded × 4 hours = $1,200. That's more than 40× the direct revenue loss on a $20k MRR business.
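Both examples use the same multiplication; as a function (input values are the article's worked examples):

```python
def engineering_labor_cost(engineers: int, loaded_hourly_rate: float, hours: float) -> float:
    """Cost of engineer time consumed by an incident:
    detection + fix + verification + post-mortem."""
    return engineers * loaded_hourly_rate * hours

print(engineering_labor_cost(2, 120, 3))  # 720  (2 engineers x $120/hr x 3 hrs)
print(engineering_labor_cost(3, 100, 4))  # 1200 (3 engineers x $100/hr x 4 hrs)
```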

3. Support overhead

A customer-facing outage generates support tickets. Volume depends on how visible the failure is, but expect 5–20 tickets for a B2B SaaS with 50–200 customers.

Support cost = tickets × avg resolution time × support hourly rate

15 tickets × 20 minutes × $40/hr = $200

Plus proactive communication:
- Status page update: 20 min
- Email to affected customers: 45 min
- Slack update to CS team: 15 min
≈ 1.5 hours of non-engineering time ≈ $60–90

Total support overhead for a medium-sized outage: $250–400. Small compared to engineering labor, but adds up across multiple incidents per month.
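The two components (ticket handling plus proactive communication) combine like so; the 1.5-hour proactive figure is the estimate above, and the default rate is illustrative:

```python
def support_overhead(tickets: int, minutes_per_ticket: float, hourly_rate: float,
                     proactive_hours: float = 1.5) -> float:
    """Ticket resolution cost plus proactive comms
    (status page, customer emails, internal updates)."""
    ticket_cost = tickets * (minutes_per_ticket / 60) * hourly_rate
    proactive_cost = proactive_hours * hourly_rate
    return ticket_cost + proactive_cost

# 15 tickets x 20 min x $40/hr = $200, plus 1.5 hrs proactive at $40/hr = $60
print(support_overhead(15, 20, 40))  # 260.0
```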

4. Customer trust — the expensive one

This is the hardest cost to quantify and the most dangerous to ignore.

For B2B SaaS: a serious outage during a trial period often ends the trial-to-paid conversion. A renewal scheduled in the next 30 days becomes uncertain. An enterprise customer in active evaluation makes notes. A customer who was about to expand their seat count reconsiders.

Trust cost model (approximate):
Customers at risk × churn probability increase × MRR × CLV multiplier

If 5 customers experienced the outage during a critical workflow:
- 10% higher churn probability × 5 customers × $200 MRR × 24 months CLV
= 0.10 × 5 × $200 × 24 = $2,400

The multiplier depends on when the outage happened and who was affected. An outage that hits 5 customers during a demo call costs 10× more in trust terms than one that happens at 3am on a Sunday.
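The approximate trust model above, expressed as a function (all inputs are the illustrative figures from the example):

```python
def trust_cost(customers_at_risk: int, churn_prob_increase: float,
               mrr_per_customer: float, clv_months: int) -> float:
    """Expected lifetime revenue put at risk by outage-driven churn."""
    return customers_at_risk * churn_prob_increase * mrr_per_customer * clv_months

# 5 customers hit during a critical workflow, 10% higher churn,
# $200 MRR each, 24-month CLV
print(trust_cost(5, 0.10, 200, 24))  # 2400.0
```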

5. Opportunity cost

The 4 hours your two best engineers spent on the incident are 4 hours they didn't spend on the feature that's blocking three sales deals. At a stage where feature velocity directly connects to revenue growth (most seed-stage companies), the opportunity cost is real and compounding.

This is hard to put an exact number on, but for most early-stage startups, 4 hours of your best engineers' time is worth $400–1,000 in foregone output — and more if those hours would have shipped something that closes a deal.

Worked example: $20k MRR, 1-hour outage

Cost category       | Amount | Basis
Direct revenue loss | $27    | ($20k MRR ÷ 730 hrs) × 1 hr × 100% users affected
Engineering labor   | $1,200 | 3 engineers × 4 hrs × $100/hr loaded cost
Support overhead    | $290   | 15 tickets + proactive status page and customer emails
Customer trust risk | $1,500 | Conservative; below the $2,400 the trust model above produces
Opportunity cost    | $600   | 4 engineering hours at $150/hr value foregone
Total               | $3,617 | 18% of monthly revenue from a single 1-hour outage

At 2–4 incidents per month (common for fast-moving startups), the annualised cost runs $86k–$174k. Most founders see only the $27 direct revenue line.
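The worked example can be reproduced in a few lines. The trust and opportunity figures are the table's estimates rather than formula outputs:

```python
costs = {
    "direct_revenue": 27,               # ($20k MRR / 730 hrs) x 1 hr, rounded
    "engineering_labor": 3 * 100 * 4,   # 3 engineers x $100/hr loaded x 4 hrs
    "support_overhead": 290,            # tickets + proactive comms
    "customer_trust": 1_500,            # conservative estimate
    "opportunity_cost": 4 * 150,        # 4 engineering hours at $150/hr foregone
}
total = sum(costs.values())
print(total)                            # 3617
print(f"{total / 20_000:.0%}")          # 18% of monthly revenue
print(total * 2 * 12, total * 4 * 12)   # 86808 173616 (annualised, 2-4 incidents/mo)
```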

What changes the math

Lever                          | Impact on total cost
Cut MTTR: 60 min → 15 min      | Engineering labor drops ~75%; trust damage window shrinks proportionally
10-min detection delay         | Non-linear cost increase: customers accumulate errors before support has context
2am vs business-hours incident | 2am incidents typically take 2× longer: tired engineer, full team unavailable
Enterprise SLA with penalties  | A $3k incident becomes a $30k one; contractual penalties dwarf all other lines
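As a rough sensitivity check on the first lever, assume total incident cost scales linearly with MTTR. That is a simplification (fixed costs like the post-mortem don't shrink), but it matches the article's ~75% labor reduction for a 4× MTTR improvement:

```python
def incident_cost_at_mttr(mttr_minutes: float, cost_at_60min: float = 3_617) -> float:
    """Rough model: total incident cost scales linearly with MTTR.
    Simplification -- labor and trust shrink with resolve time,
    but fixed costs (e.g. the post-mortem) do not."""
    return cost_at_60min * (mttr_minutes / 60)

print(round(incident_cost_at_mttr(15)))  # a ~$3,600 incident becomes ~$900
```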

What to do with this

Track the cost of each incident, even roughly. Over 3 months, you'll see whether incidents are getting cheaper (process is improving) or staying flat (nothing is changing). The engineering labor line is the most controllable — it comes down almost directly with MTTR. The trust line is the most dangerous — it accrues silently until a renewal conversation reveals it.

Related reading

  • → MTTR under 5 minutes: what actually moves the needle for small teams
  • → How to find root cause in AWS CloudWatch alerts without an SRE team

Frequently asked questions

How much does a 1-hour AWS outage actually cost a startup?

For a $20k MRR startup, a realistic 1-hour outage costs approximately $3,600 — not the $27 in direct revenue loss most founders calculate. The full breakdown: $27 direct revenue, $1,200 engineering labor (3 engineers × 4 hours), $290 support overhead, $1,500 customer trust risk, and $600 opportunity cost. Engineering labor is almost always the largest immediate line item.

What is the most expensive hidden cost of an AWS outage?

Customer trust is the most dangerous cost because it accrues silently. An outage during a trial period often ends the conversion. A renewal in the next 30 days becomes uncertain. A customer considering seat expansion reconsiders. These costs don't appear in your bank account until 30–90 days later during renewal and expansion conversations.

How does reducing MTTR affect outage cost?

Cutting MTTR from 60 minutes to 15 minutes drops engineering labor costs by approximately 75% and shrinks the customer trust damage window proportionally. For a $20k MRR business, this converts a ~$3,600 incident into a ~$900 incident. At 2–4 incidents per month, the annualised saving from a 4× MTTR improvement is $65k–$130k.

When does a 1-hour AWS outage cost significantly more than average?

Four situations multiply outage cost non-linearly: hitting a high-conversion window (flash sale, trial expiry, end of month), hitting customers during a demo call, having enterprise SLA penalty clauses (a $3k incident becomes $30k), and 2am incidents which take 2× longer because the tired engineer is solo and the full team is unavailable.

What is the loaded cost of an engineer during an incident?

Loaded cost is approximately 2× base salary — it includes benefits, equity dilution, and overhead. A $120k salary equals roughly $120/hr in total cost to the company. For a 3-engineer incident taking 4 hours: 3 × $100/hr × 4 hours = $1,200 — typically more than 40× the direct revenue loss for a $20k MRR business during a 1-hour outage.

Related reading

  • → MTTR under 5 minutes: what actually moves the needle
  • → How to set up on-call rotations when your team is 3 engineers
  • → ConvOps vs PagerDuty (2026)

Still debugging incidents manually?

ConvOps does this automatically — root cause in under 60 seconds, delivered to WhatsApp or Slack.

Try ConvOps free → · See a live demo