Cheat Cards
Printable reference cards for quick review before your interview. Each card covers one essential topic.
Estimation Numbers (Math)
Time
- 1 day = 86,400 sec ≈ 10⁵
- 1 month = 2.6M sec ≈ 2.5 × 10⁶
- 1 year = 31.5M sec ≈ 3 × 10⁷
Storage
- 1 char = 1 byte (ASCII), 1-4 bytes (UTF-8)
- 1 int = 4 bytes, 1 long = 8 bytes
- 1 UUID = 16 bytes
- 1 KB = 10³ B, 1 MB = 10⁶ B, 1 GB = 10⁹ B, 1 TB = 10¹² B
- 1 image ≈ 300 KB, 1 min of video ≈ 50 MB
Users
- DAU/MAU ratio ≈ 0.3-0.5
- Peak traffic ≈ 2-5× average
- QPS = DAU × actions/day ÷ 86,400
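The QPS formula above is easiest to internalize with a worked example; the inputs here (10M DAU, 20 actions/day, 3× peak factor) are hypothetical:

```python
# Back-of-envelope QPS estimate (hypothetical inputs).
dau = 10_000_000          # daily active users
actions_per_day = 20      # requests per user per day
peak_factor = 3           # peak traffic = 3x average

avg_qps = dau * actions_per_day / 86_400
peak_qps = avg_qps * peak_factor

print(round(avg_qps))   # ≈ 2315 QPS average
print(round(peak_qps))  # ≈ 6944 QPS at peak
```

In an interview, round aggressively: 200M requests/day over ~10⁵ seconds is ~2K QPS.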
Latency
- L1 cache: 1 ns
- L2 cache: 4 ns
- RAM: 100 ns
- SSD random read: 100 μs
- HDD seek: 10 ms
- Same-DC round trip: 0.5 ms
- Cross-region round trip: 150 ms
Database Selection (Storage)
SQL (PostgreSQL, MySQL)
- ACID transactions, strong consistency
- Complex joins, aggregations
- Structured data with relationships
- Up to ~10 TB comfortably, ~10K QPS
NoSQL Document (MongoDB)
- Flexible schema, nested documents
- Horizontal scaling built in
- No joins (denormalized data)
- Good for: catalogs, user profiles, CMS
Key-Value (Redis, DynamoDB)
- Sub-millisecond reads (Redis: in-memory)
- Simple get/set by key
- Good for: cache, sessions, counters
- Redis: 100K+ QPS on a single node
Wide-Column (Cassandra)
- Write-optimized (LSM tree)
- Distributed, no single point of failure
- Good for: time series, IoT, chat messages
- High write throughput, eventual consistency
Decision Rules of Thumb
- Need joins? → SQL
- Need sub-ms reads? → Redis
- Need high write throughput? → Cassandra
- Need full-text search? → Elasticsearch
- Need graph traversals? → Neo4j
Caching Strategy (Performance)
Patterns
- Cache-Aside: app checks cache → miss → DB → populate cache
- Write-Through: write cache + DB together
- Write-Behind: write cache, async flush to DB
- Read-Through: cache auto-fetches on miss
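Cache-aside is the pattern most often probed in interviews; a minimal sketch, where the `cache` and `db` dicts stand in for Redis and a real database:

```python
cache = {}                         # stands in for Redis
db = {"user:1": {"name": "Ada"}}   # stands in for the database

def get_user(key):
    # 1. Check the cache first.
    if key in cache:
        return cache[key]
    # 2. On a miss, read from the DB...
    value = db.get(key)
    # 3. ...and populate the cache for next time.
    if value is not None:
        cache[key] = value
    return value

get_user("user:1")  # miss: reads DB, fills cache
get_user("user:1")  # hit: served from cache
```

The key property: the application owns the cache logic, so a cache failure degrades to DB reads instead of errors.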
Eviction Policies
- LRU: remove least recently used (most common)
- LFU: remove least frequently used
- TTL: auto-expire after a time period
- FIFO: first in, first out
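LRU, the most common policy above, can be sketched with `collections.OrderedDict`; the capacity of 2 is just for illustration:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

lru = LRUCache(2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")       # "a" becomes most recently used
lru.put("c", 3)    # evicts "b", the LRU entry
```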
Common Issues
- Thundering herd: use locks + staggered TTLs
- Cache penetration: bloom filter for non-existent keys
- Inconsistency: use TTL + event-driven invalidation
- Cold start: pre-warm cache on deploy
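The staggered-TTL mitigation above is just base TTL plus random jitter, so keys cached together don't all expire (and stampede the DB) together; a sketch with illustrative values:

```python
import random

def ttl_with_jitter(base_ttl_sec=3600, jitter_frac=0.1):
    # Spread expirations over +/-10% of the base TTL so keys that were
    # cached at the same moment don't expire at the same moment.
    jitter = base_ttl_sec * jitter_frac
    return base_ttl_sec + random.uniform(-jitter, jitter)

# Usage with a Redis client (hypothetical):
#   r.setex(key, int(ttl_with_jitter()), value)
```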
Key Numbers
- Redis: 100K+ ops/sec, <1 ms latency
- Memcached: 1M+ ops/sec (simple get/set)
- Target cache hit ratio: >90%
- 80/20 rule: 20% of data serves 80% of requests
Load Balancing (Networking)
Algorithms
- Round Robin: simple rotation
- Weighted RR: proportional to capacity
- Least Connections: route to the least busy server
- IP Hash: consistent routing per client
- Random: simple but surprisingly effective
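Round robin and (naive) weighted round robin fit in a few lines; server names and weights here are hypothetical:

```python
import itertools

servers = ["app1", "app2", "app3"]

# Plain round robin: rotate through the pool.
rr = itertools.cycle(servers)
picks = [next(rr) for _ in range(4)]   # app1, app2, app3, app1

# Naive weighted round robin: repeat each server in proportion
# to its capacity, then rotate. (Real balancers interleave more
# smoothly, e.g. Nginx's smooth weighted round robin.)
weights = {"app1": 3, "app2": 1}       # app1 takes 3x the traffic
weighted_pool = [s for s, w in weights.items() for _ in range(w)]
wrr = itertools.cycle(weighted_pool)
```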
L4 vs L7
- L4 (transport): TCP/UDP level, fast, no payload inspection
- L7 (application): HTTP-aware, path/header routing
- L4 for: WebSocket, database proxy, raw TCP
- L7 for: microservice routing, A/B testing, SSL termination
Health Checks
- HTTP GET /health → 200 OK
- Interval: 10-30 seconds
- Failure threshold: 2-3 consecutive failures
- Recovery: 2-3 consecutive successes
Common Products
- Cloud: AWS ALB/NLB, GCP Load Balancing, Azure Load Balancer
- Software: Nginx, HAProxy, Envoy
- DNS-based: Route 53, Cloudflare
Message Queues (Architecture)
When to Use
- Async processing (email, notifications, image resizing)
- Decoupling services for independent scaling
- Buffering traffic spikes (absorbing bursts)
- Fan-out: one event → multiple consumers
- Guaranteed delivery (at-least-once)
Kafka vs RabbitMQ vs SQS
- Kafka: ordered log, high throughput, replay
- RabbitMQ: traditional queue, flexible routing, low latency
- SQS: managed, no ops, serverless friendly
- Kafka for: event streaming, analytics, logs
- RabbitMQ for: task queues, RPC patterns
Key Concepts
- At-least-once: may deliver duplicates → consumers must be idempotent
- At-most-once: may lose messages (rarely used)
- Exactly-once: hard! Use idempotency + deduplication
- Partition: ordered within a partition, parallel across partitions
- Consumer group: each message is processed by one consumer per group
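At-least-once delivery means the consumer must deduplicate; a minimal idempotent-consumer sketch keyed on a message ID (in production the seen-set lives in Redis or a DB unique constraint, not process memory):

```python
processed_ids = set()  # production: Redis SET or a DB unique constraint

def handle(message):
    msg_id = message["id"]
    if msg_id in processed_ids:
        return "skipped duplicate"   # redelivery: do nothing
    # ... do the real work exactly once ...
    processed_ids.add(msg_id)
    return "processed"

handle({"id": "m1"})  # processed
handle({"id": "m1"})  # skipped duplicate (broker redelivered it)
```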
Numbers
- Kafka: 1M+ msg/sec per cluster
- SQS: standard queues scale to nearly unlimited throughput; FIFO ≈ 300 msg/sec (3,000 with batching)
- RabbitMQ: 50K msg/sec per cluster
- Always monitor: queue depth and consumer lag
Consistency Models (Distributed)
CAP Theorem
- C: Consistency (every read gets the latest write)
- A: Availability (every request gets a response)
- P: Partition tolerance (system works despite network splits)
- Must choose CP or AP during a partition
Models (Strongest → Weakest)
- Linearizable: real-time ordering (slowest)
- Sequential: global total order
- Causal: respects cause-effect relationships
- Eventual: all replicas converge eventually (fastest)
When to Use What
- Strong: banking, inventory (prevent overselling)
- Eventual: social media feeds, likes, views
- Causal: messaging (see messages in order)
- Most systems: eventual consistency is fine!
Techniques
- Quorum: W + R > N (e.g., W=2, R=2, N=3)
- Vector clocks: track causal ordering
- CRDTs: conflict-free replicated data types
- 2PC: two-phase commit for distributed transactions
- Saga: compensating transactions for microservices
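The quorum condition W + R > N guarantees that every read set overlaps every write set in at least one replica (pigeonhole), so reads see the latest acknowledged write; a quick check:

```python
def quorum_overlaps(w, r, n):
    # Any write set of size w and read set of size r drawn from n
    # replicas must intersect when w + r > n.
    return w + r > n

quorum_overlaps(2, 2, 3)  # True: reads always see the latest write
quorum_overlaps(1, 1, 3)  # False: a read may land on a stale replica
```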
API Design Patterns (API)
REST Best Practices
- Nouns for resources: /users, /posts/{id}
- HTTP verbs: GET/POST/PUT/PATCH/DELETE
- Status codes: 200, 201, 400, 401, 404, 500
- Pagination: cursor-based (not offset)
- Versioning: /v1/users or the Accept header
Rate Limiting
- Token Bucket: smooth, allows bursts
- Sliding Window: precise, more memory
- Fixed Window: simple, but spikes at window edges
- Typical limit: 100 req/min per user
- Return 429 Too Many Requests + a Retry-After header
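A token-bucket limiter (the "smooth, allows bursts" option above) in sketch form; the rate and capacity values are illustrative:

```python
import time

class TokenBucket:
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec   # refill rate (tokens/sec)
        self.capacity = capacity   # max burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond 429 + Retry-After

bucket = TokenBucket(rate_per_sec=5, capacity=10)
# First 10 calls pass as a burst; after that, ~5 requests/sec.
```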
Authentication
- JWT: stateless, self-contained tokens
- OAuth 2.0: third-party authorization
- API key: simple, for server-to-server
- Session: stateful, stored server-side
Idempotency
- GET, PUT, DELETE are naturally idempotent
- POST: use an Idempotency-Key header
- Critical for: payments, orders, bookings
- Store the request hash → return the cached response
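Server-side Idempotency-Key handling reduces to a lookup from key to cached response; a sketch with an in-memory dict (real systems use a shared store with a TTL, and the endpoint name is hypothetical):

```python
idempotency_store = {}  # key -> cached response; real systems add a TTL

def create_payment(idempotency_key, amount):
    # Retried request: replay the original response, never charge twice.
    if idempotency_key in idempotency_store:
        return idempotency_store[idempotency_key]
    response = {"status": "charged", "amount": amount}  # real work, once
    idempotency_store[idempotency_key] = response
    return response

create_payment("key-123", 50)  # charges the card
create_payment("key-123", 50)  # client retry: same response, no new charge
```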
Scaling Rules of Thumb (Architecture)
Server Capacity
- 1 app server: 10K-50K QPS (depends on workload)
- 1 Redis node: 100K+ QPS
- 1 PostgreSQL: 5K-10K QPS (indexed queries)
- 1 Kafka cluster: 1M+ msg/sec
- 1 Elasticsearch node: 10K-50K searches/sec
When to Scale
- >70% CPU → add servers
- >10K QPS → add caching
- >1 TB data → consider sharding
- >100 ms p99 → optimize or cache
- >1,000 DB connections → add a connection pooler
Scaling Sequence
1. Vertical scaling (bigger machine)
2. Caching (Redis for reads)
3. Read replicas (distribute reads)
4. CDN (static + cacheable content)
5. Horizontal scaling (more app servers)
6. Sharding (distribute writes + data)
7. Async processing (queues)
8. Microservices (if necessary)
The Scaling Interview
- Always start simple, then scale incrementally
- Justify each scaling decision with numbers
- Mention monitoring at every layer
- Call out trade-offs: cost, complexity, consistency