When to reach for it
- You have a working legacy in production and a big-bang rewrite has been correctly identified as suicide
- The legacy has a seam — HTTP boundary, job queue, database view — where traffic can be intercepted and routed
- The migration will run for months or years and needs to keep delivering business value the whole time
- You can split along bounded contexts where each slice is migratable independently
- Leadership understands this isn't a six-month project and won't pull the plug at the 70% mark
What it actually costs
Two systems in production for the entire window. Two deploys, two on-call rotations, two ways to query the same data, two places to add this morning's urgent fix. The router (proxy, gateway, feature flag) is the most critical part of the architecture and is rarely treated as such. Data consistency between old and new is its own subproject — dual writes, CDC, reconciliation. And the last 10% — niche features, the quarterly batch nobody understands — takes longer than the first 90%.
The failure mode nobody mentions
Strangler that never strangles. Three years in, the new system handles 80% of traffic, the legacy handles the gnarly 20% nobody wants to touch, and the team has moved to greenfield work. The router becomes permanent infrastructure, the dual-system tax becomes a line item nobody questions. Set a date and a budget for the final cutover before you start, and treat the last mile as a real project.
When not to use it
The legacy is small enough to rewrite in a month, or already off — strangler is a long, expensive game, only worth it when a clean rewrite isn't an option.