Chapter 1 · Lesson 2

What is Serverless?

Serverless doesn't mean "no servers" — it means you never think about them. Learn what that shift really looks like, and when it pays off versus when to think twice.

From physical racks to functions

Every application needs compute somewhere. The question is: how much of that infrastructure do you own and operate?

The traditional stack

In the early days of the web, you rented rack space (or bought hardware outright), installed an OS, patched it, configured a web server, managed TLS certificates, provisioned capacity for your busiest day, and hoped nothing failed at 2 a.m. Virtual machines (VMs) on services like Amazon EC2 removed the physical hardware concern but kept everything else squarely on your plate. Containers (Docker, ECS, EKS) raised the abstraction level again: you stopped configuring each server by hand and started shipping app code inside portable images. But you still had to decide how many containers to run, how to scale them, and how to keep the cluster healthy.

Serverless flips the model

With serverless, the cloud provider owns all of the undifferentiated heavy lifting below your application code. AWS Lambda, for example, takes a zip file (or container image) containing your function, runs it on demand in response to an event, scales from zero to thousands of concurrent executions automatically, and charges you only for the milliseconds your code actually ran. You never SSH into a machine, never patch a kernel, and never size a fleet.
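To make "your function" concrete, here is a minimal sketch of what a Python handler might look like. The `name` field in the event and the HTTP-style response shape are assumptions for illustration; the real event shape depends on which service triggers the function.

```python
import json


def handler(event, context):
    """Entry point that Lambda calls once per event.

    `event` is the trigger payload as a dict; `context` carries runtime
    metadata such as the request ID and the remaining execution time.
    """
    # Hypothetical payload field: a real event's shape depends on the trigger.
    name = event.get("name", "world")
    # Response shaped like an HTTP proxy reply (an assumed trigger type).
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test; in Lambda the service supplies event and context.
    print(handler({"name": "Lambda"}, None))
```

Everything outside this file (the machine, the OS, the runtime process that calls `handler`) is the provider's concern in this model.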

Shared responsibility by model

Responsibility              | Physical / On-Prem | VM (EC2) | Containers (ECS/EKS) | Serverless (Lambda)
Data center & hardware      | You                | AWS      | AWS                  | AWS
Host OS patching            | You                | You      | You (nodes)          | AWS
Runtime / language version  | You                | You      | You (image)          | AWS (managed runtimes)
Capacity planning & scaling | You                | You      | You                  | AWS (automatic)
High availability           | You                | You      | Shared               | AWS (built-in)
Application code & logic    | You                | You      | You                  | You
Business data & security    | You                | You      | You                  | You

[Figure: two layered-stack diagrams. Traditional (EC2 / VM): you manage app code, runtime and dependencies, OS patching and security, capacity planning and scaling, and high availability / load balancing; AWS manages the physical hardware. Serverless (Lambda): you manage app code; AWS manages the runtime (Node 20, Python, etc.), OS patching and security, auto scaling (0 → ∞), high availability, and the physical hardware.]
Responsibility split: traditional VM vs. serverless. In the serverless model, every layer below your application code is owned and operated by AWS.

The three core benefits

1. No server management

You ship a function. AWS runs it. There are no AMIs to build, no Auto Scaling Groups to tune, no load balancer health checks to configure. This isn't just convenient: it changes how you think about infrastructure. Small teams can build and operate services that would previously have required a dedicated DevOps team.

2. Automatic scaling

Lambda scales horizontally and within seconds. Each concurrent request gets its own execution environment, up to the account-level concurrency limit, which defaults to 1,000 per region and can be raised. A spike from 1 request per second to 10,000 therefore requires no action on your part, as long as the resulting concurrency stays within that limit. When traffic drops to zero, so does your infrastructure footprint, and so does your bill.

Concurrency vs. instances
Lambda concurrency is the number of function executions happening simultaneously, not the number of "servers". AWS manages the underlying fleet invisibly. You can set a reserved concurrency cap per function to throttle runaway invocations, or provisioned concurrency to pre-warm environments for latency-sensitive paths.
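
As a rough illustration of that distinction, steady-state concurrency can be estimated as the request rate multiplied by the average duration. The sketch below assumes that simple model; the function names and sample numbers are hypothetical, and the limit of 1,000 is the default cited in this lesson.

```python
import math

DEFAULT_REGION_LIMIT = 1000  # default concurrency limit cited in this lesson


def required_concurrency(requests_per_second: float, avg_duration_ms: float) -> int:
    """Estimate simultaneous executions: arrival rate x average duration."""
    return math.ceil(requests_per_second * avg_duration_ms / 1000)


def would_throttle(requests_per_second: float, avg_duration_ms: float,
                   limit: int = DEFAULT_REGION_LIMIT) -> bool:
    """True if the estimated concurrency exceeds the available limit."""
    return required_concurrency(requests_per_second, avg_duration_ms) > limit


if __name__ == "__main__":
    # 10,000 requests/second at 80 ms each needs about 800 environments.
    print(required_concurrency(10_000, 80), would_throttle(10_000, 80))
    # The same rate at 250 ms each needs about 2,500, above the default limit.
    print(required_concurrency(10_000, 250), would_throttle(10_000, 250))
```

The same request rate can sit comfortably under the limit or well over it depending only on how long each invocation takes, which is why duration matters as much as traffic when you size a concurrency cap.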

3. Pay per invocation

Lambda pricing has two dimensions: the number of requests (first 1 million free per month, then $0.20 per million) and the duration, measured in GB-seconds (memory allocated × execution time, with execution time rounded up to the nearest millisecond). If your API receives 500,000 requests a month, the request charge is $0, and for short, small-memory functions the entire Lambda compute bill may be $0 as well. Compare that to an always-on EC2 instance that costs money whether or not a single request arrives.
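
The arithmetic can be sketched in a few lines. The request price and free request allowance come from the paragraph above; the per-GB-second price and the free duration allowance are assumptions based on commonly published x86 list prices, so check the current pricing page for your region before relying on the output.

```python
FREE_REQUESTS = 1_000_000            # per month, from this lesson
PRICE_PER_MILLION_REQUESTS = 0.20    # USD, from this lesson
PRICE_PER_GB_SECOND = 0.0000166667   # USD, assumed x86 list price
FREE_GB_SECONDS = 400_000            # per month, assumed free-tier allowance


def monthly_lambda_cost(requests: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Estimate a month's Lambda compute bill in USD, rounded to cents."""
    # Duration is billed as memory (GB) x execution time (seconds).
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    duration_cost = billable_gb_seconds * PRICE_PER_GB_SECOND
    return round(request_cost + duration_cost, 2)


if __name__ == "__main__":
    # 500,000 requests at 100 ms and 256 MB stay inside both allowances.
    print(monthly_lambda_cost(500_000, 100, 256))
    # 10 million requests at 100 ms and 512 MB exceed both allowances.
    print(monthly_lambda_cost(10_000_000, 100, 512))
```

Under these assumed prices the first case costs nothing and the second comes to a few dollars, which illustrates why low-volume APIs are often effectively free to run.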

When serverless shines

Serverless is not a universal hammer, but it fits a remarkably broad set of workloads:

  • REST and GraphQL APIs / backends. Pair Lambda with API Gateway or an Application Load Balancer. Each route maps to one or more functions. This is the most common Lambda pattern and what this course focuses on.
  • File and image processing. An S3 ObjectCreated event triggers a Lambda that resizes an uploaded image, extracts metadata, or runs a virus scan — then stores the result. No polling, no cron, no workers.
  • Scheduled jobs. EventBridge Scheduler (or an EventBridge rule with a cron expression) invokes a Lambda on any schedule you choose: daily report generation, database cleanup, cache warm-up. No server needs to be running 24/7 for a job that fires once a day.
  • Event-driven processing. Lambda integrates natively with SQS, SNS, Kinesis, DynamoDB Streams, and EventBridge. Fan-out architectures, decoupled microservices, and audit pipelines are all natural fits.
  • Webhooks. Third-party services (Stripe, GitHub, Twilio) POST events to your HTTPS endpoint. A Lambda processes the payload and responds in under a second. You pay for only those milliseconds.
  • Authentication hooks. Cognito triggers (pre-sign-up, post-confirmation, custom auth) let you add business logic to your auth flow without running a dedicated auth service.

Rule of thumb
If a workload is event-driven, variable or bursty in traffic, and each unit of work completes in under 15 minutes, serverless is almost always worth evaluating first.
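
To illustrate the file-processing pattern from the list above, here is a minimal sketch of a handler that reads an S3 event notification. The `process_upload` helper is a hypothetical placeholder for the real work (resizing, scanning, and so on).

```python
from urllib.parse import unquote_plus


def process_upload(bucket: str, key: str) -> None:
    """Hypothetical placeholder for the real work on the uploaded object."""


def handler(event, context):
    """Collect the bucket and key of every newly created object in the event."""
    uploaded = []
    for record in event.get("Records", []):
        # Skip anything that is not an object-creation notification.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (for example, spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        process_upload(bucket, key)
        uploaded.append({"bucket": bucket, "key": key})
    return {"processed": len(uploaded), "objects": uploaded}
```

The function only runs when an upload happens, so there is no polling loop or idle worker to pay for.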

When to think twice

An honest course tells you where the tool has rough edges. Here are the main situations where serverless deserves extra scrutiny:

  • Long-running or compute-heavy workloads. Lambda has a maximum execution timeout of 15 minutes. Video transcoding, large ML training jobs, or complex batch ETL that runs for hours belongs on EC2, ECS Fargate, or AWS Batch. If you can decompose the work into sub-15-minute chunks, Lambda can still work — but the architecture gets more complex.
  • Steady, high-throughput traffic. Lambda's pricing model is economical for variable loads, but if you run millions of executions per minute around the clock with predictable volume, a dedicated fleet of containers or reserved EC2 instances may be cheaper. Do the math before committing either way.
  • Cold-start-sensitive applications. When a Lambda function hasn't been invoked recently, AWS must initialize a new execution environment: it loads your code, starts the runtime, and runs your initialization code (including SDK setup) before the first line of your handler runs. For a Node.js 20 function with a small bundle this is typically 50–300 ms, which most API clients will not notice. For Java or .NET functions, or functions with heavy SDK imports, cold starts can exceed 1 second. Provisioned concurrency avoids this at extra cost, but it narrows the pricing advantage.
  • Heavy stateful processing. Lambda execution environments are ephemeral. Any in-memory state is gone when the environment is recycled. If your workload depends on large, mutable in-memory data structures that must survive across requests (e.g., a real-time game server or a complex session-based workflow), you'll need to externalize all state to DynamoDB, ElastiCache, or S3 — and that round-trip adds latency.
  • Vendor lock-in concerns. Lambda functions, SAM templates, EventBridge rules, and API Gateway configurations are AWS-specific. Migrating to another cloud or back to on-premises requires meaningful re-architecture. If portability is a hard requirement, containerized workloads on Kubernetes give you more flexibility.
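
For the steady-traffic caveat, "doing the math" can start with a simple break-even estimate. The sketch below ignores the free tier and assumes a per-GB-second price of $0.0000166667 (a commonly published x86 list price). The fixed monthly fleet cost is a hypothetical input that you would replace with your own quote.

```python
REQUEST_PRICE = 0.20 / 1_000_000     # USD per request, from this lesson
PRICE_PER_GB_SECOND = 0.0000166667   # USD, assumed x86 list price


def lambda_cost_per_request(avg_duration_ms: float, memory_mb: int) -> float:
    """Cost of one invocation: request fee plus duration fee."""
    gb_seconds = (avg_duration_ms / 1000) * (memory_mb / 1024)
    return REQUEST_PRICE + gb_seconds * PRICE_PER_GB_SECOND


def break_even_requests(fixed_monthly_cost: float, avg_duration_ms: float,
                        memory_mb: int) -> int:
    """Monthly request count at which Lambda costs as much as a fixed fleet."""
    return int(fixed_monthly_cost / lambda_cost_per_request(avg_duration_ms, memory_mb))


if __name__ == "__main__":
    # Hypothetical comparison: a $100/month always-on fleet versus
    # 100 ms invocations at 512 MB.
    print(break_even_requests(100.0, 100, 512))
```

With these assumed inputs the break-even point lands at roughly 97 million requests per month. Below that volume Lambda is cheaper on compute alone; above it, the fixed fleet starts to win, before accounting for the operational effort of running it.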

Hybrid architectures are common
Most production systems on AWS are not purely serverless or purely traditional. A typical pattern: Lambda functions handle API requests and events, while a long-running ECS Fargate task processes a nightly batch job, and an ElastiCache cluster holds shared session data. Pick the right tool per workload.
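
Before moving a batch job off Lambda entirely, it can be worth checking whether it decomposes into chunks, as mentioned in the long-running caveat. The sketch below shows one possible shape: the handler works through a list until its time budget runs low, then returns a cursor so a later invocation can resume. The event fields `items` and `cursor`, the `process_item` helper, and the 10-second safety margin are all hypothetical; `get_remaining_time_in_millis()` is the method the Lambda context object provides. Whatever re-invokes the function with the returned cursor (an orchestrator or a queue, for instance) is outside this sketch.

```python
SAFETY_MARGIN_MS = 10_000  # hypothetical buffer to stop before the timeout


def process_item(item) -> None:
    """Hypothetical placeholder for one unit of work."""


def handler(event, context):
    """Process items from the cursor onward until done or out of time."""
    items = event["items"]
    cursor = event.get("cursor", 0)
    while cursor < len(items):
        # Stop early and report progress if the timeout is getting close.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            return {"done": False, "cursor": cursor}
        process_item(items[cursor])
        cursor += 1
    return {"done": True, "cursor": cursor}


class FakeContext:
    """Local stand-in for the Lambda context, for testing only."""

    def __init__(self, remaining_ms_values):
        self._values = iter(remaining_ms_values)

    def get_remaining_time_in_millis(self):
        return next(self._values)
```

This is the extra architectural complexity the caveat refers to: the work itself is unchanged, but you now own progress tracking and re-invocation.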

Lesson Summary

  • Serverless means the cloud provider manages everything below your application code — OS, runtime, scaling, and availability — so you focus entirely on business logic.
  • The three headline benefits are no server management, automatic scaling from zero to massive concurrency, and a pay-per-invocation model that makes bursty workloads very cost-efficient.
  • Serverless excels for APIs, event-driven pipelines, file processing, scheduled jobs, and webhooks — workloads that are bursty, short-lived, and event-triggered.
  • Know the trade-offs: Lambda has a 15-minute timeout, cold starts add latency, and per-invocation pricing may be less efficient than reserved capacity for steady high-volume workloads.
  • Vendor lock-in is real, but for most startups and internal tools the operational savings often outweigh the portability cost. Evaluate this against your organization's requirements.