How Load Balancers Work
A load balancer is the traffic cop of distributed systems. It sits between clients and a pool of backend servers, distributing incoming requests so no single server bears the full load. Without one, a single server becomes both a bottleneck and a single point of failure.
What Is a Load Balancer?
A load balancer accepts connections from clients and forwards them to one of several backend servers based on a configured algorithm. From the client's perspective, there is only one endpoint — the load balancer's IP or DNS name. The server pool is invisible.
Algorithms
Round-Robin
Each incoming request is assigned to the next server in a circular sequence. With three servers, request 1 goes to Server A, request 2 to Server B, request 3 to Server C, and request 4 back to Server A. It is simple and works well when all servers have similar capacity and requests cost roughly the same to process.
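The rotation described above can be sketched in a few lines. This is a minimal illustration, not how any particular load balancer implements it; the class name `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed circular order (illustrative sketch)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list forever: A, B, C, A, B, C, ...
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Each call advances one position and wraps around at the end.
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
picks = [balancer.next_server() for _ in range(4)]
print(picks)  # ['server-a', 'server-b', 'server-c', 'server-a']
```

Note that the balancer keeps no information about how busy each server is, which is why the approach assumes similar capacity and workload.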
Least Connections
Routes each new request to the server with the fewest active connections at that moment. This works better than round-robin when requests have varying processing times: a server tied up with slow requests holds more open connections, so it receives fewer new ones until they finish.
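As a rough sketch of the idea, the balancer only needs a per-server counter that goes up when a request starts and down when it finishes. The class and method names here (`LeastConnectionsBalancer`, `acquire`, `release`) are hypothetical, and ties are broken by list order purely for simplicity.

```python
class LeastConnectionsBalancer:
    """Tracks active connections per server (illustrative sketch)."""

    def __init__(self, servers):
        # Active connection count per server, all starting at zero.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Pick the server with the fewest active connections.
        # On a tie, min() returns the first such server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when the request on `server` has finished.
        self.active[server] -= 1


lb = LeastConnectionsBalancer(["server-a", "server-b"])
first = lb.acquire()   # server-a (tie broken by order)
second = lb.acquire()  # server-b
lb.release(second)     # server-b finishes quickly
third = lb.acquire()   # server-b again, because server-a is still busy
print(first, second, third)  # server-a server-b server-b
```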
IP Hash
Hashes the client's IP address to consistently route requests from the same client to the same server. Useful for applications with server-side session state that isn't shared across the pool.
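A minimal sketch of IP hashing, assuming a simple hash-then-modulo scheme. The function name `pick_server_by_ip` and the addresses are made up for illustration; real load balancers may use different hash functions or consistent hashing.

```python
import hashlib


def pick_server_by_ip(client_ip, servers):
    """Map a client IP to one server, the same one on every call."""
    # Use a stable hash. Python's built-in hash() is randomized per
    # process for strings, so it would not give consistent routing.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server-a", "server-b", "server-c"]
first_visit = pick_server_by_ip("203.0.113.7", servers)
second_visit = pick_server_by_ip("203.0.113.7", servers)
print(first_visit == second_visit)  # True
```

One caveat of the modulo approach: when a server is added to or removed from the pool, most clients map to a different server. Consistent hashing is a common way to reduce that reshuffling.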
Weighted Round-Robin
Assigns a weight to each server proportional to its capacity. A server with weight 3 receives three requests for every one received by a server with weight 1. Use this when servers have heterogeneous hardware.
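The simplest way to picture weighting is to repeat each server in the rotation as many times as its weight. The sketch below does exactly that; the function name `weighted_rotation` and the server names are hypothetical.

```python
from itertools import cycle


def weighted_rotation(weights):
    """Return an endless iterator that honors integer weights."""
    # A server with weight 3 appears three times in the rotation.
    expanded = [
        server
        for server, weight in weights.items()
        for _ in range(weight)
    ]
    return cycle(expanded)


rotation = weighted_rotation({"big": 3, "small": 1})
picks = [next(rotation) for _ in range(8)]
print(picks)
# ['big', 'big', 'big', 'small', 'big', 'big', 'big', 'small']
```

This naive version sends requests to the heavier server in bursts. Production implementations usually interleave the picks more evenly (NGINX, for example, uses a smooth weighted round-robin), but the long-run ratio is the same.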
Layer 4 vs Layer 7
| Property | L4 (Transport) | L7 (Application) |
|---|---|---|
| Operates at | TCP/UDP level | HTTP/HTTPS level |
| Can inspect | IP, port, protocol | Headers, URL path, cookies, body |
| Performance | Faster (less processing) | Slightly slower (parses HTTP) |
| Routing rules | IP- and port-based | Path-based, host-based, header-based |
| TLS termination | Usually pass-through (some, such as AWS NLB, can also terminate TLS) | Terminate, optionally re-encrypt to the backend |
| AWS equivalent | NLB (Network LB) | ALB (Application LB) |
For most web services, use an L7 load balancer. It enables path-based routing (/api/* to one service, /static/* to another), host-based routing (api.example.com vs app.example.com), and TLS termination, so certificates are managed at the load balancer instead of on every backend server.
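To make host-based and path-based routing concrete, here is a small sketch of a first-match rule table. The rule format, the `route` function, and the pool names (`api-servers`, `static-servers`, `web-servers`) are all assumptions for illustration and do not mirror any specific product's configuration.

```python
def route(host, path, rules, default_pool):
    """Return the backend pool of the first rule that matches."""
    for rule in rules:
        # A rule without "host" matches any host.
        host_ok = rule.get("host") in (None, host)
        # A rule without "path_prefix" matches any path.
        path_ok = path.startswith(rule.get("path_prefix", "/"))
        if host_ok and path_ok:
            return rule["pool"]
    return default_pool


RULES = [
    {"host": "api.example.com", "pool": "api-servers"},
    {"path_prefix": "/api/", "pool": "api-servers"},
    {"path_prefix": "/static/", "pool": "static-servers"},
]

print(route("api.example.com", "/v1/users", RULES, "web-servers"))
# api-servers
print(route("app.example.com", "/static/logo.png", RULES, "web-servers"))
# static-servers
print(route("app.example.com", "/dashboard", RULES, "web-servers"))
# web-servers
```

An L4 load balancer cannot make these decisions because the host header and URL path are only visible after parsing the HTTP request.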
Health Checks
A load balancer must know which servers are healthy before routing traffic to them.
Active Health Checks
The load balancer periodically sends requests (typically GET /health) to each server. If a server fails to respond within the timeout or returns a non-2xx status for a configured number of consecutive checks, it is marked unhealthy and removed from the pool.
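The consecutive-failure rule above can be modeled as a small state tracker that is fed the result of each probe. This sketch leaves out the actual HTTP request and timer. The class name `HealthTracker` and the threshold values are hypothetical, and the recovery rule (a number of consecutive successes before a server is readmitted) is an assumption added here, although many load balancers offer a similar setting.

```python
class HealthTracker:
    """Tracks one server's health from a stream of probe results."""

    def __init__(self, unhealthy_threshold=3, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold  # assumed recovery rule
        self.healthy = True
        self._failures = 0
        self._successes = 0

    def record(self, status_code):
        """Record one probe. Pass None to represent a timeout."""
        ok = status_code is not None and 200 <= status_code < 300
        if ok:
            self._failures = 0
            self._successes += 1
            if not self.healthy and self._successes >= self.healthy_threshold:
                self.healthy = True   # readmit to the pool
        else:
            self._successes = 0
            self._failures += 1
            if self.healthy and self._failures >= self.unhealthy_threshold:
                self.healthy = False  # remove from the pool
        return self.healthy


tracker = HealthTracker()
states = [tracker.record(code) for code in [200, 503, None, 500, 200, 200]]
print(states)  # [True, True, True, False, False, True]
```

Requiring several consecutive failures keeps a single dropped probe from ejecting an otherwise healthy server.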
Passive Health Checks
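The load balancer observes responses to real traffic. If a backend returns 5xx errors or times out for more than a configured fraction of requests, it is marked unhealthy. This is less intrusive than active probing, but slower to detect failures: a problem is noticed only after some client requests have already failed, and an idle server generates no signal at all.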
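One way to sketch passive checking is a sliding window over the most recent responses from a backend. The class name `PassiveMonitor`, the window size, and the failure-ratio threshold are hypothetical values chosen for illustration.

```python
from collections import deque


class PassiveMonitor:
    """Judges a backend by the failure ratio of its recent responses."""

    def __init__(self, window=10, max_failure_ratio=0.5):
        # Only the most recent `window` outcomes are kept.
        self.outcomes = deque(maxlen=window)
        self.max_failure_ratio = max_failure_ratio

    def observe(self, status_code):
        # A timeout (None) or any 5xx response counts as a failure.
        failed = status_code is None or status_code >= 500
        self.outcomes.append(failed)

    @property
    def healthy(self):
        # Until the window is full, there is too little data to judge.
        if len(self.outcomes) < self.outcomes.maxlen:
            return True
        ratio = sum(self.outcomes) / len(self.outcomes)
        return ratio <= self.max_failure_ratio


monitor = PassiveMonitor(window=4, max_failure_ratio=0.5)
for code in [200, 500, 502, None]:
    monitor.observe(code)
print(monitor.healthy)  # False (3 of the last 4 requests failed)
```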
Sticky Sessions
Sticky sessions (session affinity) route all requests from a given client to the same backend server. This is sometimes needed for applications that store session state in memory on the server.
Cookie-Based Affinity
The load balancer injects a cookie (e.g. AWSALB on AWS ALB) that identifies the target server. On subsequent requests, the LB reads the cookie and routes to the same server.
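The cookie flow can be sketched as a function that either honors an existing affinity cookie or picks a server and asks for the cookie to be set. Everything here is illustrative: the cookie name `lb_target`, the function `choose_backend`, and storing the raw server name in the cookie are assumptions. Real load balancers generally store an opaque or encrypted value instead.

```python
def choose_backend(request_cookies, servers, pick_new, cookie_name="lb_target"):
    """Return (server, cookies_to_set) for one incoming request."""
    target = request_cookies.get(cookie_name)
    if target in servers:
        # Known client whose server is still in the pool: stay put.
        return target, {}
    # New client, or the cookie points at a server that is gone:
    # fall back to the normal algorithm and (re)issue the cookie.
    server = pick_new(servers)
    return server, {cookie_name: server}


servers = ["server-a", "server-b"]


def pick_first(pool):
    # Stand-in for any balancing algorithm.
    return pool[0]


print(choose_backend({}, servers, pick_first))
# ('server-a', {'lb_target': 'server-a'})
print(choose_backend({"lb_target": "server-b"}, servers, pick_first))
# ('server-b', {})
print(choose_backend({"lb_target": "server-z"}, servers, pick_first))
# ('server-a', {'lb_target': 'server-a'})
```

The third call shows the fallback when a cookie refers to a server that has been removed, in which case any in-memory session on that server is lost.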
IP-Based Affinity
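Hashes the client IP to a server. This is simpler but less reliable: many clients can share a single IP behind NAT or a corporate proxy, so they all land on the same server, and a client whose IP changes (for example, on a mobile network) loses its affinity.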
Real-World Examples
- NGINX — Can act as an L7 load balancer with upstream blocks. Supports round-robin, least_conn, ip_hash, and weighted variants. Free and widely deployed.
- HAProxy — Purpose-built, high-performance L4/L7 load balancer. Extremely configurable and commonly used for TCP load balancing.
- AWS ALB — Managed L7 LB with path-based routing, WebSocket support, gRPC support, and native integration with AWS WAF and Cognito.
- AWS NLB — Managed L4 LB for ultra-low latency use cases. Supports static IPs and Elastic IPs, useful when clients need to whitelist specific IPs.
- Cloudflare — Global load balancer operating at the DNS level with health checks, geo-steering, and failover across cloud providers.