All insights
EngineeringProcess5 min read
The hotfix workflow you'll wish you had
When production is on fire, your hotfix workflow is what stands between a 30-minute incident and a 4-hour one. Most teams discover theirs is broken during the incident.
Most companies design their hotfix workflow during a calm afternoon and test it during a 3am production incident. The gap between the design and what actually happens under pressure is usually larger than anyone wants to admit.
What a working hotfix flow looks like
- A documented one-page runbook visible from the incident channel.
- A reduced-process path that bypasses normal review for genuine emergencies.
- A canary deploy for hotfixes — "emergency" doesn't mean "skip safety."
- A post-hotfix follow-up that lands the proper fix and tests within a week.
Test it before you need it
Run a quarterly hotfix drill. Pretend production is broken, deploy a no-op change through the hotfix path, time it. The first time you do this, the result will surprise you. The second time, much less so.
Your hotfix process works in theory. Pressure-test it before you need it.