System Design Basics

System design questions test your ability to architect scalable, reliable systems. This guide covers the fundamental concepts that come up most often in Java backend interviews.

Scalability Concepts

Q: Vertical vs horizontal scaling?

	Vertical (Scale Up)	Horizontal (Scale Out)
Method	Bigger machine	More machines
Limit	Hardware ceiling	Near-unlimited
Downtime	Often requires restart	Add nodes live
Cost	Expensive at high end	Commodity hardware
Complexity	Low	Higher (distributed systems)

Java microservices typically scale horizontally.

Q: What is the difference between stateless and stateful services?

Stateless: any server can handle any request
  → session in Redis/JWT, not server memory
  → easy to scale, load balance

Stateful: server holds session/state
  → sticky sessions or session replication
  → harder to scale

Design REST APIs to be stateless for scalability.
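
As a rough illustration of keeping session state out of server memory, the sketch below signs a user ID with HMAC-SHA256 so that any server holding the shared secret can verify it. It is a simplified stand-in for a JWT, not a JWT implementation: the class name `SignedSessionToken`, the token layout, and the secret handling are assumptions made for this example, and a real service would normally use an established JWT library and add an expiry claim.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;

// Hypothetical helper: a signed token that carries the user ID, so no server-side session is needed.
final class SignedSessionToken {
    private final byte[] secret;

    SignedSessionToken(String secret) {
        this.secret = secret.getBytes(StandardCharsets.UTF_8);
    }

    // Token layout (an assumption for this sketch): base64url(userId) + "." + base64url(HMAC(payload)).
    String issue(String userId) {
        String payload = encode(userId.getBytes(StandardCharsets.UTF_8));
        return payload + "." + encode(hmac(payload));
    }

    // Returns the user ID if the signature matches, otherwise an empty Optional.
    Optional<String> verify(String token) {
        int dot = token.indexOf('.');
        if (dot < 0) return Optional.empty();
        String payload = token.substring(0, dot);
        byte[] actual = token.substring(dot + 1).getBytes(StandardCharsets.UTF_8);
        byte[] expected = encode(hmac(payload)).getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison avoids leaking how many bytes matched.
        if (!MessageDigest.isEqual(expected, actual)) return Optional.empty();
        try {
            return Optional.of(new String(Base64.getUrlDecoder().decode(payload), StandardCharsets.UTF_8));
        } catch (IllegalArgumentException malformed) {
            return Optional.empty();
        }
    }

    private static String encode(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    private byte[] hmac(String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Two instances model two servers that share only the secret, not session memory.
        SignedSessionToken serverA = new SignedSessionToken("demo-secret");
        SignedSessionToken serverB = new SignedSessionToken("demo-secret");
        String token = serverA.issue("user-42");
        System.out.println(serverB.verify(token).isPresent());
    }
}
```

Because verification needs only the shared secret, a load balancer can route each request to any instance.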

Load Balancing

  Client → Load Balancer → [Server 1, Server 2, Server 3]

Algorithm	Description
Round Robin	Rotate through servers
Least Connections	Send to least busy server
Weighted	More traffic to powerful servers
IP Hash	Same client → same server (sticky)

Common tools in Java deployments: Nginx, HAProxy, Spring Cloud LoadBalancer, Kubernetes Service.
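
The selection algorithms in the table can be sketched in a few lines each. The example below is a minimal, in-memory illustration of round robin, least connections, and IP hash; the class `LoadBalancers` and the server names are hypothetical, and production balancers such as the tools listed above also handle health checks, weights, and connection tracking.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class; not the API of any real load balancer.
final class LoadBalancers {

    // Round robin: rotate through the servers in order.
    static final class RoundRobin {
        private final List<String> servers;
        private final AtomicInteger next = new AtomicInteger();

        RoundRobin(List<String> servers) {
            this.servers = List.copyOf(servers);
        }

        String pick() {
            // floorMod keeps the index non-negative even after the counter overflows.
            return servers.get(Math.floorMod(next.getAndIncrement(), servers.size()));
        }
    }

    // Least connections: choose the server with the fewest active connections.
    static String leastConnections(Map<String, Integer> activeConnections) {
        return activeConnections.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow();
    }

    // IP hash: the same client IP always maps to the same server (sticky).
    static String ipHash(String clientIp, List<String> servers) {
        return servers.get(Math.floorMod(clientIp.hashCode(), servers.size()));
    }

    public static void main(String[] args) {
        List<String> servers = List.of("server-1", "server-2", "server-3");
        RoundRobin roundRobin = new RoundRobin(servers);
        for (int i = 0; i < 4; i++) {
            System.out.println("round robin -> " + roundRobin.pick());
        }

        Map<String, Integer> connections = new LinkedHashMap<>();
        connections.put("server-1", 5);
        connections.put("server-2", 1);
        connections.put("server-3", 3);
        System.out.println("least connections -> " + leastConnections(connections));
        System.out.println("ip hash -> " + ipHash("203.0.113.7", servers));
    }
}
```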

Caching Strategies

Q: Where to cache?

  Browser → CDN → API Gateway → Application Cache → Database
                                    ↑
                              Redis / Caffeine

Strategy	Description	Use case
Cache-aside	App checks cache, loads from DB on miss	General purpose
Read-through	Cache loads from DB automatically	Simplified app code
Write-through	Write to cache and DB synchronously	Strong consistency
Write-behind	Write to cache, async to DB	Write-heavy workloads
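
To make the cache-aside row concrete, here is a minimal sketch in which plain maps stand in for both the cache and the database. The class `CacheAsideRepository` and its fields are hypothetical; in a real service the cache would typically be Redis or Caffeine and the database lookup a repository call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical repository illustrating cache-aside with in-memory stand-ins.
final class CacheAsideRepository {
    private final Map<String, String> database;                         // stand-in for the real DB
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis/Caffeine
    int databaseReads = 0;                                               // counts misses, for demonstration only

    CacheAsideRepository(Map<String, String> database) {
        this.database = database;
    }

    // Read path: check the cache first, fall back to the database on a miss, then populate the cache.
    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) return cached;
        databaseReads++;
        String value = database.get(key);
        if (value != null) cache.put(key, value);
        return value;
    }

    // Write path: update the database, then evict so the next read reloads the fresh value.
    void update(String key, String value) {
        database.put(key, value);
        cache.remove(key);
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("user:1", "Ada");
        CacheAsideRepository repository = new CacheAsideRepository(db);
        repository.get("user:1"); // miss: loads from the database
        repository.get("user:1"); // hit: served from the cache
        System.out.println("database reads: " + repository.databaseReads);
    }
}
```

Evicting on write rather than updating the cached value is one common choice; it keeps the write path simple at the cost of one extra miss.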

Q: Cache invalidation strategies?

TTL — expire after time (simple, stale data possible)
Event-driven — invalidate on update (Kafka, Spring @CacheEvict)
Version-based — include version in cache key
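
Two of these strategies are easy to show in isolation. The sketch below implements a small TTL cache with an injectable clock and a helper that builds version-based keys; `TtlCache`, `versionedKey`, and the key layout are assumptions for illustration, not the API of Redis, Caffeine, or Spring.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical TTL cache; the clock is injected so expiry can be demonstrated without sleeping.
final class TtlCache<V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(String key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns null when the key is absent or its TTL has elapsed.
    V get(String key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) return null;
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key, entry); // lazily drop the stale entry
            return null;
        }
        return entry.value;
    }

    // Version-based invalidation: bumping the version makes old entries unreachable.
    static String versionedKey(String entity, long id, long version) {
        return entity + ":" + id + ":v" + version;
    }

    public static void main(String[] args) {
        TtlCache<String> cache = new TtlCache<>(60_000, System::currentTimeMillis);
        cache.put(versionedKey("user", 42, 3), "Ada");
        System.out.println(cache.get(versionedKey("user", 42, 3))); // same version: found
        System.out.println(cache.get(versionedKey("user", 42, 4))); // new version: not found
    }
}
```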

Database Design

Q: SQL vs NoSQL — when to use each?

	SQL (PostgreSQL)	NoSQL (MongoDB, Redis)
Schema	Fixed, relational	Flexible, document/key-value
Transactions	ACID	Varies by store (often eventual consistency; MongoDB supports multi-document ACID transactions)
Scaling	Vertical + read replicas	Horizontal sharding
Best for	Complex queries, joins	High write throughput, flexible schema

Q: Database sharding?

Split data across multiple databases by shard key:

  user_id % 4 = 0 → Shard 0
  user_id % 4 = 1 → Shard 1
  user_id % 4 = 2 → Shard 2
  user_id % 4 = 3 → Shard 3

Challenges: cross-shard queries, rebalancing, hot shards.
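
The modulo routing above, and the rebalancing challenge it creates, can be sketched as follows. `ShardRouter` and `movedKeys` are hypothetical names; the second method simply counts how many sequential IDs would land on a different shard if the shard count changed.

```java
// Hypothetical router illustrating modulo sharding and its rebalancing cost.
final class ShardRouter {

    // Maps a user ID to a shard index; floorMod keeps the result non-negative for negative IDs.
    static int shardFor(long userId, int shardCount) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    // Counts IDs in [0, keyCount) whose shard changes when the shard count changes.
    static long movedKeys(long keyCount, int oldShardCount, int newShardCount) {
        long moved = 0;
        for (long id = 0; id < keyCount; id++) {
            if (shardFor(id, oldShardCount) != shardFor(id, newShardCount)) moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        System.out.println("user 7 -> shard " + shardFor(7, 4));
        // Going from 4 to 5 shards relocates 800 of the first 1,000 sequential IDs.
        System.out.println("keys moved: " + movedKeys(1_000, 4, 5));
    }
}
```

With plain modulo routing, adding one shard relocates most keys, which is why schemes such as consistent hashing are often discussed as a follow-up.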

Message Queues

Q: When to use async messaging?

Decouple services (order service → notification service)
Absorb traffic spikes (buffer requests)
Event-driven architecture (order created → inventory, shipping, analytics)

  Order Service → Kafka → [Inventory, Shipping, Analytics]
  (sync, fast)              (async, independent)
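
The fan-out in the diagram can be imitated in memory to show the decoupling. The sketch below is a toy topic in which each subscriber gets its own queue; `InMemoryTopic` is a hypothetical class and not a Kafka client, and it omits everything that makes Kafka useful in production (durability, partitions, consumer groups, offsets).

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory topic: one queue per subscriber, so consumers progress independently.
final class InMemoryTopic {
    private final List<BlockingQueue<String>> subscriberQueues = new CopyOnWriteArrayList<>();

    // Each subscriber receives its own queue and reads at its own pace.
    BlockingQueue<String> subscribe() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        subscriberQueues.add(queue);
        return queue;
    }

    // The publisher returns as soon as the event is enqueued; it never waits for consumers.
    void publish(String event) {
        for (BlockingQueue<String> queue : subscriberQueues) {
            queue.offer(event);
        }
    }

    public static void main(String[] args) {
        InMemoryTopic orders = new InMemoryTopic();
        BlockingQueue<String> inventory = orders.subscribe();
        BlockingQueue<String> shipping = orders.subscribe();

        orders.publish("order-created:1001");

        // Each consumer sees the event, regardless of when the other one reads it.
        System.out.println("inventory got " + inventory.poll());
        System.out.println("shipping got " + shipping.poll());
    }
}
```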

Common Design Questions

Q: Design a URL shortener.

Key components:

API — POST /shorten, GET /{code} redirect
Encoding — base62 of auto-increment ID or hash
Storage — Redis (hot) + PostgreSQL (persistent)
Cache — cache popular URLs in Redis
Scale — stateless API servers behind load balancer

Capacity estimate: 100M URLs × 500 bytes = 50GB storage.
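
The "base62 of auto-increment ID" option can be sketched as below. The class name `Base62` and the alphabet ordering (digits, then uppercase, then lowercase) are assumptions for this example; any fixed ordering of the 62 characters works as long as encoding and decoding agree.

```java
// Hypothetical base62 codec for turning numeric IDs into short URL codes.
final class Base62 {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Encodes a non-negative ID, most significant digit first.
    static String encode(long id) {
        if (id < 0) throw new IllegalArgumentException("id must be non-negative");
        if (id == 0) return "0";
        StringBuilder code = new StringBuilder();
        while (id > 0) {
            code.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        }
        return code.reverse().toString();
    }

    // Decodes a code produced by encode(); rejects characters outside the alphabet.
    static long decode(String code) {
        long id = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) throw new IllegalArgumentException("invalid character: " + c);
            id = id * 62 + digit;
        }
        return id;
    }

    public static void main(String[] args) {
        String code = encode(100_000_000L);
        System.out.println(code + " -> " + decode(code));
    }
}
```

Since 62^5 is about 916 million, codes of five characters are enough to cover the 100M URLs in the estimate above.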

Q: Design a rate limiter.

Approaches:

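Token bucket: refill tokens at a fixed rate; each request consumes one token, which allows short bursts up to the bucket size
Sliding window: count requests in the trailing time window (for example, the last 60 seconds)
Fixed window: count requests per discrete interval (for example, each clock minute); simple, but allows bursts at window boundaries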

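  // Redis-based fixed window counter: one key per user per window.
  // Assumes `redis` is a Jedis-style client and `windowSize` is in seconds.
  long now = System.currentTimeMillis() / 1000;
  String key = "rate:" + userId + ":" + (now / windowSize);
  long count = redis.incr(key);
  // Set the TTL on the first hit; INCR and EXPIRE are separate commands, so this is not atomic.
  if (count == 1) redis.expire(key, windowSize);
  if (count > maxRequests) throw new RateLimitExceededException();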
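
For comparison, the token bucket approach can be sketched for a single process as follows. The class `TokenBucket`, its constructor parameters, and the injected nanosecond clock are assumptions for this example; a distributed limiter would keep the bucket state in a shared store such as Redis.

```java
import java.util.function.LongSupplier;

// Hypothetical single-process token bucket; the clock is injected so behavior is easy to test.
final class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;                 // start full, so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Returns true and consumes one token if a token is available, otherwise returns false.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double refilled = (now - lastRefillNanos) * refillPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refilled); // never exceed the bucket size
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Bucket of 2 tokens refilled at 1 token per second.
        TokenBucket bucket = new TokenBucket(2, 1.0, System::nanoTime);
        for (int i = 1; i <= 3; i++) {
            System.out.println("request " + i + " allowed: " + bucket.tryAcquire());
        }
    }
}
```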

Q: Design a notification system.

  Event → Kafka → Notification Service → [Email, SMS, Push]
                      ↓
                 Template Engine
                      ↓
                 Delivery Queue (with retry)
                      ↓
                 Provider APIs (SendGrid, Twilio, FCM)

Key concerns: idempotency, retry with backoff, user preferences, delivery tracking.
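
Two of these concerns, idempotency and retry with backoff, can be sketched together. In the example below, `NotificationSender` and its `Provider` interface are hypothetical and do not correspond to the SendGrid, Twilio, or FCM SDKs; the in-memory set of delivered IDs stands in for a durable store, and the check against it is not atomic across threads.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sender: skips already-delivered notifications and retries failures with backoff.
final class NotificationSender {
    // Stand-in for a provider client; a real one would call an external API.
    interface Provider {
        void send(String recipient, String message) throws IOException;
    }

    private final Set<String> deliveredIds = ConcurrentHashMap.newKeySet();
    private final Provider provider;
    private final int maxAttempts;
    private final long baseDelayMillis;

    NotificationSender(Provider provider, int maxAttempts, long baseDelayMillis) {
        this.provider = provider;
        this.maxAttempts = maxAttempts;
        this.baseDelayMillis = baseDelayMillis;
    }

    // Exponential backoff: base, 2x base, 4x base, ... for attempts 1, 2, 3, ...
    static long backoffMillis(long baseDelayMillis, int attempt) {
        return baseDelayMillis << (attempt - 1);
    }

    // Returns true if the notification was delivered now or on an earlier call.
    boolean deliver(String notificationId, String recipient, String message) {
        if (deliveredIds.contains(notificationId)) return true; // duplicate event: do not resend
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                provider.send(recipient, message);
                deliveredIds.add(notificationId);
                return true;
            } catch (IOException sendFailure) {
                if (attempt == maxAttempts) return false;       // retries exhausted
                try {
                    Thread.sleep(backoffMillis(baseDelayMillis, attempt));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();         // preserve the interrupt flag
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        NotificationSender sender = new NotificationSender(
                (to, msg) -> System.out.println("sending to " + to + ": " + msg), 3, 100);
        sender.deliver("n-1", "user@example.com", "Your order shipped");
        sender.deliver("n-1", "user@example.com", "Your order shipped"); // ignored as a duplicate
    }
}
```

Production systems usually add random jitter to the delay so that many failed deliveries do not retry at the same instant.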

CAP Theorem

During a network partition, a distributed system must choose between:

Consistency — all nodes see same data
Availability — every request gets a response

Examples from stores commonly used alongside Java services:

CP — ZooKeeper, etcd (consistent but may reject requests)
AP — Cassandra, DynamoDB (available but may return stale data)

Many web applications choose AP with eventual consistency for data where brief staleness is acceptable.

Interview Framework

Structure your answer:

Requirements — functional + non-functional (QPS, latency, storage)
Estimation — back-of-envelope calculations
High-level design — boxes and arrows
Deep dive — database schema, API design, key algorithms
Bottlenecks — identify and address scaling limits
Trade-offs — explain choices and alternatives
