Backfills: the migration's evil twin
Changing how data works going forward is the easy part. Fixing all the data that already exists is where projects quietly blow up.
When you change how a system handles data, the new behavior is usually straightforward. The hard part is the existing data — all the records created under the old rules that now need to be brought in line. That's a backfill, and it's where seemingly simple changes turn into multi-week ordeals.
History doesn't update itself
Adding a new field or rule going forward is easy. Applying it to millions of existing records — correctly, without downtime, without corrupting anything — is a real project. Teams routinely scope the forward change and forget the backfill, then discover the latter is the bulk of the work.
Treat it as its own project
- Plan the backfill explicitly, separate from the forward change.
- Run it in batches so it doesn't overwhelm the system or block live traffic.
- Make it resumable — backfills get interrupted and you don't want to restart from zero.
- Verify a sample of records before and after the run; a wrong backfill corrupts history at scale.
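The checklist above can be sketched in a few lines of Python. This is a minimal illustration, not a production tool. It assumes a SQLite database and a hypothetical users table with an integer id, a non-null email, and a new email_normalized column to fill. The backfill_checkpoint table, the job name, and the batch size are also made up for the example.

```python
import sqlite3

BATCH_SIZE = 1000  # assumed value; tune for the real system
JOB = "normalize_email"  # hypothetical job name used as the checkpoint key


def backfill(conn, batch_size=BATCH_SIZE, max_batches=None):
    """Fill users.email_normalized in id order, one transaction per batch.

    Returns the number of rows updated during this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backfill_checkpoint ("
        "job TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
    )
    # Resume from the last committed batch, or from the start on a first run.
    row = conn.execute(
        "SELECT last_id FROM backfill_checkpoint WHERE job = ?", (JOB,)
    ).fetchone()
    last_id = row[0] if row else 0

    updated = 0
    batches = 0
    while max_batches is None or batches < max_batches:
        rows = conn.execute(
            "SELECT id, email FROM users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        # The batch and its checkpoint commit together, so an interruption
        # never leaves the checkpoint ahead of the data.
        with conn:
            conn.executemany(
                "UPDATE users SET email_normalized = ? WHERE id = ?",
                [(email.strip().lower(), row_id) for row_id, email in rows],
            )
            last_id = rows[-1][0]
            conn.execute(
                "INSERT OR REPLACE INTO backfill_checkpoint (job, last_id) "
                "VALUES (?, ?)",
                (JOB, last_id),
            )
        updated += len(rows)
        batches += 1
    return updated


def count_unfilled(conn):
    """Verification helper: how many rows still lack the new value."""
    return conn.execute(
        "SELECT COUNT(*) FROM users WHERE email_normalized IS NULL"
    ).fetchone()[0]
```

Calling count_unfilled before and after a run gives a cheap sanity check, and spot-checking a handful of rows by id covers correctness as well as completeness. Keying batches on the primary key instead of an offset keeps each query cheap and makes the checkpoint a single number.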
The new behavior is the easy half. Fixing all the data that already exists is the project nobody scoped.