
Scalability

Scope

The ability of a system to handle a growing amount of work by adding resources, as well as the architectural patterns that enable this growth.

Why This Topic Exists

A system that is not designed to scale will eventually become a victim of its own success. Scalability is about designing systems that can grow to meet future demand without requiring a fundamental, and often costly, redesign.

Core Tradeoffs

  • Vertical vs. Horizontal Scaling: Scaling “up” by adding more resources (CPU, RAM) to a single machine versus scaling “out” by adding more machines to the system. Vertical scaling is simpler but is capped by the largest machine available; horizontal scaling is more complex but has a far higher ceiling, because capacity grows with the number of machines.
  • Stateful vs. Stateless Services: Stateless services are significantly easier to scale horizontally because any server can handle any request. Stateful services require careful management of session data, which complicates load balancing and scaling.
  • Scalability vs. Cost: Designing for massive, web-scale traffic from day one can be prohibitively expensive and is a form of premature optimization. The tradeoff is to build a system that can scale when needed without over-provisioning for current traffic.
  • Read vs. Write Scaling: Scaling read operations (e.g., through caching or read replicas) is often easier than scaling write operations, which may require more complex strategies like sharding.

Common Failure Modes

  • The Database Bottleneck: The database is often the most difficult component of a system to scale and frequently becomes the bottleneck that limits the scalability of the entire application.
  • “Hot Spots” in Sharded Systems: In a system with sharded data, a poor choice of sharding key can lead to some shards becoming “hot” and overloaded while others remain idle, negating the benefits of horizontal scaling.
  • Hidden Dependencies on a Single Point of Failure: A system that appears to be horizontally scalable may have a hidden, critical dependency on a component that cannot be easily scaled out (e.g., a legacy singleton service or a centralized lock manager).
  • Loss of Locality: As a system scales out, data and services become more distributed. This can lead to increased network latency as services need to communicate across the network to fulfill requests.
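
The hot-spot problem in sharded systems is easy to see with a small simulation. The following is a minimal sketch, not a production sharding scheme: the shard count, the tenant and event names, and the 90/10 traffic skew are all hypothetical values chosen for illustration. It hashes the same workload by two candidate sharding keys and counts how many records land on each shard.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard using a stable hash (same result on every run)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_load(keys, num_shards: int = NUM_SHARDS) -> list[int]:
    """Return the number of records that land on each shard."""
    counts = Counter(shard_for(k, num_shards) for k in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


# Hypothetical workload: 10,000 events, 90% of them from one large tenant.
events = [
    ("tenant-big" if i % 10 else f"tenant-{i}", f"event-{i}")
    for i in range(10_000)
]

# Sharding by tenant sends every "tenant-big" record to the same shard.
by_tenant = shard_load(tenant for tenant, _ in events)

# Sharding by event ID spreads the same records roughly evenly.
by_event = shard_load(event_id for _, event_id in events)

print("by tenant:", by_tenant)
print("by event: ", by_event)
```

In this sketch, sharding by tenant puts at least 9,000 of the 10,000 records on a single shard, while sharding by event ID spreads them roughly evenly. The even spread has its own cost: reading all of one tenant's data then touches every shard, which is the loss of locality described above.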

Interview Signals

Strong candidates will immediately discuss the difference between horizontal and vertical scaling. They should be able to talk about the importance of statelessness and how to handle state in a scalable way (e.g., using a distributed cache or database). They should also be able to describe common scaling patterns for different parts of the stack, such as using a CDN for static assets, load balancers for application servers, and read replicas or sharding for databases.
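
One way to illustrate handling state in a scalable way is to move session data out of the application servers and into a shared store. The sketch below assumes a hypothetical `SessionStore` class that stands in for a distributed cache or database (here it is just an in-memory dictionary) and a hypothetical `AppServer` class that keeps no per-user state of its own. Requests for the same session are deliberately routed to different servers in round-robin order.

```python
class SessionStore:
    """In-memory stand-in for a shared store such as a distributed cache.

    Hypothetical class for illustration; a real deployment would use a
    networked store that every application server can reach.
    """

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}

    def get(self, session_id: str) -> dict:
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id: str, session: dict) -> None:
        self._data[session_id] = dict(session)


class AppServer:
    """Stateless application server: all session state lives in the store."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> list[str]:
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]  # build a new list
        session["cart"] = cart
        self.store.put(session_id, session)
        return cart


store = SessionStore()
servers = [AppServer("app-1", store), AppServer("app-2", store)]

# Round-robin: consecutive requests for one session hit different servers.
cart: list[str] = []
for i, item in enumerate(["book", "pen", "lamp"]):
    cart = servers[i % len(servers)].add_to_cart("session-1", item)

print(cart)
```

Because neither server holds the cart in local memory, the load balancer does not need sticky sessions, and servers can be added or removed without losing user state. The tradeoff is that the shared store becomes a dependency that must itself be scaled and kept available.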

Related Topics

  • Performance
  • Databases
  • Load Balancing
  • Caching
  • Sharding