Chapter 1 · Lesson 3

Functional vs Non-Functional Requirements

Non-functional requirements, not features, drive every interesting architectural decision

Two kinds of requirement, two kinds of conversation

Every system has two layers of requirements. The first describes what the system does: the verbs, the user-visible behaviors, the features. The second describes how well it must do them: the constraints under which those behaviors must hold up. The two map to two very different conversations.

  • Functional requirements (FRs). "A user can post a message. A driver can be matched to a rider. An order can be refunded." These are the behaviors a product manager writes on a whiteboard.
  • Non-functional requirements (NFRs). "p99 message-post latency < 200ms at 50k QPS. Match a driver within 5s in 95% of cases. Refunds reconcile within 24h with zero double-credits." These are the constraints that decide whether the same FR set produces Twitter or a wedding-photo site.

Most teams have rigorous FR conversations and impoverished NFR conversations. That asymmetry is the single biggest predictor of an architecture that surprises everyone in year two.

FUNCTIONAL: "What does it do?" (verbs & behaviors)
  • Post a message
  • Search by author
  • Delete one's own post

NON-FUNCTIONAL: "How well? Under what stress?" (numbers & constraints)
  • p99 latency < 200ms
  • 50k QPS peak
  • 99.99% available

The same FR list produces different systems depending on the NFRs attached.

Why NFRs drive architecture, not FRs

Consider two systems with identical functional requirements: "users can post short text messages and see other people's posts." That FR is satisfied by:

  • A single Postgres instance with a Flask app. Costs $50/mo. Serves 1,000 users.
  • A globally-distributed, multi-region, fanout-on-write timeline system with edge caching, push notifications, and 14 microservices. Costs $50M/mo. Serves Twitter.

Both satisfy the FR. The difference between them is entirely the NFRs — peak throughput, p99 latency, availability target, geographic distribution, durability of reposts. FRs decide the shape of your data model; NFRs decide the shape of your architecture.

Diagnostic question
When someone shows you an architecture diagram and you can't tell what makes it interesting, ask: "Which NFR forced this choice?" If the answer is "I'm not sure" or "it just felt right," you're looking at decoration, not architecture.

The five categories of NFR worth tracking

NFRs sprawl. To keep them organized, classify each one into one of five buckets. Most systems have at least one NFR worth writing down in each:

1. Performance

Latency, throughput, concurrency. Always specify with a percentile and a load condition. "Fast" is not an NFR. "p99 latency < 200ms at 50k QPS sustained" is.
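To make "a percentile and a load condition" concrete, here is a minimal Python sketch of how such an NFR could be checked against measurements. The function names, the nearest-rank percentile method, and the sample data are all illustrative assumptions, not part of any particular monitoring tool.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_nfr(samples_ms, pct, limit_ms, observed_qps, required_qps):
    """Check an NFR of the form 'pXX latency < limit_ms at required_qps'."""
    if observed_qps < required_qps:
        # Samples gathered below the stated load say nothing about the NFR.
        return False
    return percentile(samples_ms, pct) < limit_ms


# Hypothetical measurements: 98 fast requests and 2 slow ones.
samples = [10.0] * 98 + [500.0] * 2
print(percentile(samples, 50))  # the median looks healthy
print(percentile(samples, 99))  # the tail does not
print(meets_latency_nfr(samples, 99, 200, observed_qps=50_000, required_qps=50_000))
```

The load check comes first on purpose: a p99 measured at a fraction of the stated QPS is not evidence that the NFR holds.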

2. Reliability

Availability, durability, fault tolerance, RPO/RTO. "Highly available" is not an NFR. "99.99% availability measured monthly, RPO < 30s, RTO < 5 min" is.
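An availability percentage is easier to reason about once it is converted into a downtime budget. The sketch below does that arithmetic in Python. It assumes "measured monthly" means a 30-day window, and the function names are hypothetical.

```python
def downtime_budget_minutes(availability_pct, window_days=30):
    """Minutes of downtime allowed in the window at the given availability."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - availability_pct) / 100


def within_budget(outage_minutes, availability_pct, window_days=30):
    """True if the recorded outages still fit inside the downtime budget."""
    budget = downtime_budget_minutes(availability_pct, window_days)
    return sum(outage_minutes) <= budget


# Assumption: a 30-day window stands in for "measured monthly".
for target in (99.9, 99.95, 99.99):
    minutes = downtime_budget_minutes(target)
    print(f"{target}% over 30 days allows about {minutes:.2f} minutes of downtime")
```

Under that 30-day assumption, 99.99% leaves roughly 4.32 minutes of downtime per month, which is why it pairs naturally with an RTO target measured in minutes.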

3. Security & compliance

AuthN/Z model, encryption, audit, data residency, regulatory standards (PCI, HIPAA, SOC2, GDPR). These are often binding in ways engineers don't see until late — "we need PCI" can rewrite an architecture.

4. Operability

Deploy frequency, rollback time, observability, configurability, on-call burden. Operational NFRs are routinely under-specified and create slow-burn pain. "We must be able to roll back any deploy in <5 min" has dramatic implications.

5. Cost

The honest NFR most slides omit. "Total infrastructure spend < $X/month at Y users" is a real constraint. A 10× cheaper architecture that meets every other NFR is usually the right answer, even if a "better" one exists on paper.

How to extract NFRs from a vague stakeholder

Stakeholders rarely volunteer NFRs. They will tell you "we need it to be fast" and consider their job done. Your job is to translate. A short script that almost always works:

  1. "What's the worst thing that could happen?" Their answers are durability and security requirements in disguise. "We can't lose any payment" → strong durability. "Users can't see each other's data" → tenant isolation.
  2. "What does the user experience if it's slow?" Their answers are latency requirements. "They'll bounce" → strict p99. "We have a loading spinner" → relaxed.
  3. "How many users at peak?" Plus "When is peak?" Their answers are throughput and burst-handling requirements.
  4. "What if it's down for an hour?" Their answers are availability and DR requirements. "We'd lose $X" sharpens it instantly.
  5. "What are we not allowed to do with this data?" Their answers are compliance and residency requirements.

Force a number
Any NFR without a number is aspirational. The discipline of writing it as "X under condition Y" forces the stakeholder to commit, and the commitment is what makes the NFR useful in tradeoff conversations later. "Reasonably available" is a wish; "99.95% availability measured over a rolling 30 days" is a contract.
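One way to enforce the "X under condition Y" discipline is to record each NFR in a structure that refuses to exist without a number, a unit, and a condition. The Python sketch below is one possible shape. The class name, field names, and validation rules are assumptions made for illustration.

```python
from dataclasses import dataclass

# The five buckets used in this lesson.
CATEGORIES = ("performance", "reliability", "security/compliance", "operability", "cost")
COMPARATORS = ("<", "<=", ">", ">=")


@dataclass(frozen=True)
class NFR:
    category: str    # one of CATEGORIES
    metric: str      # what is measured, e.g. "availability"
    comparator: str  # one of COMPARATORS
    target: float    # the number the stakeholder committed to
    unit: str        # e.g. "ms", "%", "s"
    condition: str   # the load or measurement window the number applies to

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.comparator not in COMPARATORS:
            raise ValueError(f"unknown comparator: {self.comparator!r}")
        if isinstance(self.target, bool) or not isinstance(self.target, (int, float)):
            raise ValueError("target must be a number, not a description")
        if not self.unit.strip() or not self.condition.strip():
            raise ValueError("an NFR needs both a unit and a condition")

    def as_sentence(self):
        return f"{self.metric} {self.comparator} {self.target:g} {self.unit} {self.condition}"


contract = NFR("reliability", "availability", ">=", 99.95, "%",
               "measured over a rolling 30 days")
print(contract.as_sentence())

try:
    NFR("reliability", "availability", ">=", "reasonable", "%", "always")
except ValueError as err:
    print(f"rejected: {err}")
```

The second construction fails because "reasonable" is not a number, which is the same objection you would raise in the stakeholder conversation.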

The binding NFR is usually one or two

A typical system has a dozen NFRs on paper. Only one or two of them are actually binding: they determine the shape of the architecture. The rest fall out for free or can be traded away. Identifying which is which is much of the work.

A useful test: if I tightened this NFR by 10×, would the architecture have to change? (For a percentage target such as availability, tighten the allowed failure fraction: 99.9% becomes 99.99%.) If yes, it's binding. If no, it's noise. For a typical OLTP system, latency-at-percentile and availability are binding; durability is easy if your DB is competent; cost binds at a different scale than the others.
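The 10× test can be roughed out in code if you are willing to estimate, for each NFR, the best the current architecture could deliver without structural change. The Python sketch below treats an NFR as binding when that estimate no longer satisfies the tightened target. The `Check` class, the `best_achievable` field, and every number in the example are hypothetical, and the estimate itself is a judgment call rather than a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    target: float           # the NFR's number
    best_achievable: float  # estimate for the current architecture, unchanged
    lower_is_better: bool   # True for latency or data loss, False for throughput


def tightened(check, factor=10.0):
    """Target after tightening: smaller for 'lower is better', larger otherwise."""
    if check.lower_is_better:
        return check.target / factor
    return check.target * factor


def is_binding(check, factor=10.0):
    """Binding if the current architecture cannot meet the tightened target."""
    goal = tightened(check, factor)
    if check.lower_is_better:
        return check.best_achievable > goal
    return check.best_achievable < goal


# Hypothetical numbers, for illustration only.
checks = [
    Check("p99 latency (ms)", target=200, best_achievable=120, lower_is_better=True),
    Check("RPO (s)", target=30, best_achievable=1, lower_is_better=True),
    Check("peak throughput (QPS)", target=50_000, best_achievable=80_000, lower_is_better=False),
]
for check in checks:
    label = "binding" if is_binding(check) else "not binding"
    print(f"{check.name}: {label}")
```

Percentage targets such as availability fit this shape best when expressed as an allowed failure fraction (0.01% rather than 99.99%), so that dividing by 10 means what you intend.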

NFRs are why architectures get rewritten

The most common story in systems engineering: the FRs are stable for years, but one NFR changes (10× more users, a regulator showing up, a new latency target from a competitor) and a working architecture suddenly stops working. The system did not break — its environment did. This is why "architecture is evolved, not chosen" (Lesson 1.1). The trigger is almost always an NFR shift.

Key takeaways

  • FRs describe what the system does. NFRs describe how well it must do it, and under what stress.
  • NFRs — not FRs — drive every interesting architectural decision.
  • Group NFRs into five categories: performance, reliability, security/compliance, operability, cost.
  • An NFR without a number is a wish. Force a number with a unit and a condition.
  • Find the one or two binding NFRs; the rest fall out for free.
  • Architectures get rewritten because NFRs shift. Anticipate which one might shift next.

Exercise
For your current system, write each NFR as a single sentence with a number and a condition. Mark the two you believe are binding. Now ask a teammate to do the same independently. The disagreement is the most valuable conversation your team has had in months.