Chapter 1 · Lesson 2

What is System Design?

How designing systems differs from writing code: components, data flow, and scale
System design is deciding how the pieces fit — clients, balancers, servers, caches, databases.

System design is the process of deciding how to build software that can handle real-world demands — at scale, reliably, and efficiently.

The simple definition

When you write a function, you decide the logic inside it. When you design a system, you decide how multiple components work together: what talks to what, where data lives, how failures are handled, and how the whole thing scales when a million people use it at once.

Think of it like this. Writing code is choosing the materials and shaping them. System design is deciding the blueprint for the building before construction begins.

What a "system" actually means here

In this context, a system is any software made up of multiple interacting parts. A single Python script is not a system. A web application with a frontend, an API server, a database, and a cache — that is a system. A messaging app that sends billions of notifications per day is a much more complex system, but the same principles apply.

Most real systems share common components:

  • Clients — the browsers, mobile apps, or services that make requests.
  • Servers — the machines that handle those requests and produce responses.
  • Databases — where data is stored persistently.
  • Caches — where frequently used data is kept temporarily for fast access.
  • Message queues — used to pass work between components asynchronously.
  • Load balancers — distribute traffic so no single server is overwhelmed.
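To make one of these components concrete, here is a minimal sketch of a load balancer that uses round-robin distribution, which is only one of several possible strategies. The names `RoundRobinBalancer`, `make_server`, `app-1`, and `app-2` are hypothetical and chosen for illustration; real load balancers are separate network services, not in-process objects.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: hands each request to the next server in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list endlessly: 1, 2, 1, 2, ...
        self._servers = cycle(servers)

    def route(self, request):
        server = next(self._servers)
        return server(request)


def make_server(name):
    """Return a hypothetical app server: a function that labels its responses."""
    def handle(request):
        return f"{name} handled {request}"
    return handle


balancer = RoundRobinBalancer([make_server("app-1"), make_server("app-2")])
responses = [balancer.route(f"req-{i}") for i in range(4)]
for line in responses:
    print(line)
```

Requests alternate between the two servers, so neither one receives all the traffic. A production balancer would also need health checks so that it stops routing to a server that has failed.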

System design versus software engineering

These two are related but distinct. As the term is used in this lesson, software engineering is mostly about writing correct code at the function or class level. System design is about how services, data stores, and infrastructure fit together at a much higher level of abstraction. In a system design interview, you are rarely asked to write code; you are asked to sketch an architecture and explain the decisions behind it.

A concrete example

Imagine you are building a URL shortener, something like Bitly. At a small scale, one server and one database are enough. But what happens when 100 million people use it? Now you need to think about:

  • How do you generate unique short codes quickly without collisions?
  • Where do you cache redirects so lookups are fast?
  • How do you store the mapping between short codes and original URLs?
  • What happens if your single database becomes the bottleneck?
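As one possible answer to the first question, the sketch below derives short codes from a sequential counter encoded in base 62. This is an assumed approach for illustration, not a description of how Bitly or any specific service works, and `encode_base62` and `ShortCodeGenerator` are hypothetical names. Because every counter value is distinct, every code is distinct, so no collision check is needed.

```python
import string

# 62 symbols: 0-9, a-z, A-Z
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n):
    """Convert a non-negative integer to its base-62 string form."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


class ShortCodeGenerator:
    """Hands out one unique code per call by incrementing a counter."""

    def __init__(self, start=1):
        self._next_id = start

    def next_code(self):
        code = encode_base62(self._next_id)
        self._next_id += 1
        return code


generator = ShortCodeGenerator()
codes = [generator.next_code() for _ in range(1000)]
print(codes[:3], codes[-1])
```

The trade-off is that a single in-memory counter only works on one machine. With several app servers, the counter itself has to be shared or partitioned, which is exactly the kind of decision system design is concerned with.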

Answering those questions — and making defensible choices — is what system design is about. You will build this exact system in Section 16.

graph LR
  A[Client / Browser] -->|short URL request| B[Load Balancer]
  B --> C[App Server 1]
  B --> D[App Server 2]
  C --> E[(Database)]
  C --> F[(Cache)]
  D --> E
  D --> F

A basic URL shortener architecture — one of the simplest real-world system design problems.
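The app servers in this architecture talk to both a cache and a database. One common way to combine the two is to check the cache first and fall back to the database on a miss, as in the sketch below. It assumes plain dictionaries stand in for both stores, and `RedirectService`, `resolve`, and the sample URL are hypothetical names used only for illustration.

```python
class RedirectService:
    """Toy app server logic: look in the cache first, then the database."""

    def __init__(self, database):
        self._database = database  # dict standing in for persistent storage
        self._cache = {}           # dict standing in for an in-memory cache
        self.db_reads = 0          # counts how often the database is hit

    def resolve(self, short_code):
        # Fast path: the mapping is already cached.
        if short_code in self._cache:
            return self._cache[short_code]
        # Slow path: read from the database and remember the result.
        self.db_reads += 1
        long_url = self._database.get(short_code)
        if long_url is not None:
            self._cache[short_code] = long_url
        return long_url


service = RedirectService({"abc123": "https://example.com/some/long/path"})
first = service.resolve("abc123")   # cache miss: reads the database
second = service.resolve("abc123")  # cache hit: database untouched
print(first, service.db_reads)
```

After the first lookup, repeated requests for the same short code never reach the database. A real cache would also need a size limit and an eviction policy, which this sketch leaves out.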

🎯 Interview Angle
  • Interviewers do not expect a perfect answer. They want to see that you understand the components, can explain trade-offs, and can have a structured conversation about design decisions.
  • A common opening question is: "Tell me the components you'd include in this system and why." Start broad, then go deep on the parts the interviewer shows interest in.
  • Avoid jumping to implementation details too early. Before writing anything, understand what the system needs to do and at what scale.

✓ Quick Check

Q: What is the core difference between system design and writing code?
A: Writing code focuses on the logic inside individual functions or classes. System design focuses on how multiple services, databases, and infrastructure components work together to meet requirements at scale. System design operates at a higher level of abstraction.

Q: Name three common components found in most software systems.
A: Clients (browsers/apps that send requests), servers (that handle requests), databases (persistent storage), caches (fast temporary storage), load balancers (traffic distribution), and message queues (async communication). Any three of these qualify.

Q: Why does a URL shortener that works fine with 100 users start to struggle with 100 million users?
A: At small scale, a single server and database can handle all requests. At large scale, the database becomes a bottleneck, lookups slow down, and a single point of failure means all users are affected if it crashes. Scaling requires distributing load, caching frequent lookups, and adding redundancy.