Allocating for Non-Functional Requirements
The 20% rule for infrastructure work sounds sensible until you watch it evaporate every sprint. Security, performance, and compliance become invisible until they become emergencies.
I was reviewing a team’s sprint history last month, looking for patterns in their velocity fluctuations. They had a stated policy: 20% of every sprint goes to non-functional work. Security hardening. Performance improvements. Compliance requirements. The foundational stuff.
Over twelve sprints, the actual allocation averaged 6%. And that 6% happened almost entirely in two sprints where an external audit was imminent.
When I asked the engineering lead about it, she just shrugged. “Features always win.”
The Theory Everyone Agrees With
The strategic argument for dedicated non-functional capacity is solid. This work is like maintaining your car’s engine. Skip it long enough and you’re not just risking a breakdown; you’re guaranteeing one. Deferred maintenance compounds the way interest does, except the balance grows against you.
Security debt accumulates until you’re explaining a breach to customers. Performance degradation creeps until your app is noticeably slower than competitors. Compliance gaps widen until a regulation change catches you exposed.
The 20% rule, or whatever percentage your organisation uses, exists because leaders recognise that leaving this work to “when we have time” means it never happens. Protected capacity. Predictable allocation. The adults in the room being responsible.
Protected capacity for non-functional work is a strategy for avoiding future crises. The problem is that avoiding crises is invisible work.
Every VP of Engineering I’ve talked to endorses some version of this principle. It shows up in team working agreements. It gets mentioned in sprint planning. It sounds so reasonable.
What Actually Happens
Here’s the pattern I’ve watched play out at maybe fifteen different companies now.
Sprint planning starts. The 20% allocation is acknowledged. Then someone mentions the feature that sales promised a customer. Or the executive dashboard that leadership wants by end of quarter. Or the integration that a partner is waiting on.
Suddenly the conversation shifts. “Can we push the security work to next sprint? We’re so close on this feature.” And because the security work doesn’t have a customer name attached to it, doesn’t have a sales rep in the room advocating for it, and doesn’t have a deadline that triggers visible consequences, it yields.
Next sprint, the same thing happens. And the sprint after that.
The 20% becomes 10% becomes “we’ll do a dedicated infrastructure sprint later” becomes a vague line item on a roadmap that never quite arrives.
The non-functional work only resurfaces when it becomes a blocker. An enterprise prospect requires SOC 2 compliance before signing. Page load times cross a threshold where users start complaining. A security scan fails and delays a release.
Now it’s urgent. Now it gets priority. Now everyone’s scrambling.
The Translation Layer Breakdown
This is a strategy-execution gap, but it’s a specific kind. The strategy isn’t poorly conceived. The execution teams aren’t ignoring it maliciously. The breakdown happens in the translation layer, where abstract commitments meet concrete sprint decisions.
“20% for non-functional work” is a policy. But policies don’t make tradeoff decisions in sprint planning. People do. And those people are sitting in a room where the feature work has names, faces, and consequences attached to it, and the infrastructure work is a line item that says “ongoing security improvements.”
The feature has a champion. The non-functional work has a category.
I’ve seen teams try to fix this by assigning ownership. A “platform PM” or “infrastructure lead” who advocates for this work the same way a product PM advocates for features. It helps. But it also just shifts the political dynamic. Now you have two people arguing over capacity, where before one category of work was implicitly junior to the other.
The work that prevents problems will always struggle against the work that creates visible value. Prevention is invisible until it fails.
The Black Box Problem
Here’s where it gets properly frustrating. When non-functional work does happen, it often becomes illegible to the rest of the organisation.
“What did the platform team do last sprint?” “Infrastructure improvements.” “What does that mean?” Silence, or an explanation so technical that everyone’s eyes glaze over.
This isn’t the engineers’ fault. Refactoring the authentication service to reduce latency by 40 milliseconds is genuinely hard to translate into business value. It’s real work. It matters. But it doesn’t demo well.
So the work becomes a black box. Capacity goes in, something presumably happens, and unless there’s a visible outcome, the investment feels abstract. This makes it even harder to protect that capacity next sprint. Leadership starts asking why the platform team needs four engineers when nobody can articulate what they delivered.
The teams that handle this well have learned to make infrastructure work visible. Dashboards showing performance trends. Security scorecards. Compliance checklists with progress indicators. Not because the work wasn’t valuable before, but because value that can’t be seen can’t be defended.
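One cheap way to start on that visibility is to report the actual non-functional share of each sprint against the stated target. The sketch below is a minimal illustration, not a description of any particular team's tooling: the `Sprint` record, its point fields, and the sample history are all hypothetical, and it assumes your tracker can already tell you which completed points were tagged as non-functional.

```python
from dataclasses import dataclass

# Stated policy from the article; adjust to whatever your organisation uses.
TARGET_SHARE = 0.20


@dataclass
class Sprint:
    """Hypothetical record of one sprint's completed work, in story points."""
    name: str
    total_points: int
    non_functional_points: int

    @property
    def share(self) -> float:
        # Fraction of completed points that went to non-functional work.
        if self.total_points == 0:
            return 0.0
        return self.non_functional_points / self.total_points


def overall_share(sprints: list[Sprint]) -> float:
    """Non-functional share across all sprints, weighted by sprint size."""
    total = sum(s.total_points for s in sprints)
    non_functional = sum(s.non_functional_points for s in sprints)
    return non_functional / total if total else 0.0


def report(sprints: list[Sprint], target: float = TARGET_SHARE) -> list[str]:
    """Build one line per sprint plus an overall line against the target."""
    lines = []
    for s in sprints:
        flag = "ok" if s.share >= target else "below target"
        lines.append(f"{s.name}: {s.share:.0%} non-functional ({flag})")
    lines.append(
        f"Overall: {overall_share(sprints):.0%} against a {target:.0%} target"
    )
    return lines


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    history = [
        Sprint("Sprint 1", 40, 2),
        Sprint("Sprint 2", 40, 0),
        Sprint("Sprint 3", 40, 10),
    ]
    print("\n".join(report(history)))
```

A report like this won't fix the incentives, but it turns "we think we're under 20%" into a number that can be put in front of leadership every sprint.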
The Uncomfortable Middle Position
I’ve gone back and forth on where I land here.
Part of me thinks the “only when it unblocks revenue” approach is actually honest. It acknowledges that organisations have limited capacity and competing priorities. It forces infrastructure work to justify itself against concrete outcomes rather than abstract principles. It prevents the platform team from becoming a comfortable backwater where engineers do interesting technical work that doesn’t connect to business reality.
Spotify went through a version of this. Their platform teams were given significant autonomy, which led to impressive technical infrastructure but also some work that was, charitably, gold-plating. They’ve since moved toward models where infrastructure investment ties more directly to product needs.
But I’ve also watched the “only when urgent” approach create genuinely dangerous situations. A company I advised deferred security work for eighteen months because it never quite rose to the top of the priority list. Then they had an incident. The remediation cost was roughly 50x what the prevention would have been, and that’s before counting the customer trust they lost.
The 20% rule might be a useful fiction. But fictions that prevent disasters have value, even if they’re not perfectly efficient.
The Question Nobody Wants to Answer
Maybe the real issue is that we’re trying to solve a political problem with a capacity planning framework.
The 20% rule assumes that the obstacle is scheduling. That if we just allocate the time, the work will happen. But the obstacle is usually incentives. Feature delivery gets celebrated. Infrastructure work gets a polite nod at best. Performance reviews reward visible impact. Invisible prevention doesn’t make careers.
Until that changes, protected capacity will keep eroding sprint by sprint, and teams will keep lurching from normalcy to crisis mode when the deferred work finally catches up with them.
I don’t know how to fix the incentive structure. I’m not sure anyone does.
But I’m increasingly convinced that the 20% conversation is happening at the wrong level of abstraction. We’re arguing about sprint allocation when we should be asking why the work needs protection in the first place.
The answer to that question is uncomfortable enough that most organisations would rather just keep debating percentages.

