Cheat Cards

Printable reference cards for quick review before your interview. Each card covers one essential topic.

Estimation Numbers (Math)

Time

  • 1 day = 86,400 sec ≈ 10⁵
  • 1 month = 2.6M sec ≈ 2.5 × 10⁶
  • 1 year = 31.5M sec ≈ 3 × 10⁷

Storage

  • 1 char = 1 byte (ASCII), 1-4 bytes (UTF-8)
  • 1 int = 4 bytes, 1 long = 8 bytes
  • 1 UUID = 16 bytes
  • 1 KB = 10³ B, 1 MB = 10⁶, 1 GB = 10⁹, 1 TB = 10¹²
  • 1 image ≈ 300 KB, 1 video min ≈ 50 MB

Users

  • DAU / MAU ratio ≈ 0.3-0.5
  • Peak traffic ≈ 2-5× average
  • QPS = DAU × actions/day ÷ 86,400
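The QPS formula above can be run as back-of-envelope arithmetic. The figures (10M DAU, 20 actions/user/day, 3× peak factor) are illustrative assumptions, not from any real system:

```python
# Back-of-envelope QPS for a hypothetical app: 10M DAU, 20 actions/user/day.
DAU = 10_000_000
actions_per_day = 20
SECONDS_PER_DAY = 86_400

avg_qps = DAU * actions_per_day / SECONDS_PER_DAY
peak_qps = avg_qps * 3  # peak ≈ 2-5x average; 3x as a middle estimate

print(round(avg_qps))   # 2315 average QPS
print(round(peak_qps))  # 6944 peak QPS
```

In an interview, rounding to "~2.5K average, ~7K peak" is plenty of precision.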

Latency

  • L1 cache: 1 ns
  • L2 cache: 4 ns
  • RAM: 100 ns
  • SSD random: 100 μs
  • HDD seek: 10 ms
  • Same DC RTT: 0.5 ms
  • Cross-region: 150 ms

Database Selection (Storage)

SQL (PostgreSQL, MySQL)

  • ACID transactions, strong consistency
  • Complex joins, aggregations
  • Structured data with relationships
  • Up to ~10TB comfortably, ~10K QPS

NoSQL Document (MongoDB)

  • Flexible schema, nested documents
  • Horizontal scaling built-in
  • Limited joins → denormalize data instead
  • Good for: catalogs, user profiles, CMS

Key-Value (Redis, DynamoDB)

  • Sub-millisecond reads (Redis: in-memory)
  • Simple get/set by key
  • Good for: cache, sessions, counters
  • Redis: 100K+ QPS single node

Wide-Column (Cassandra)

  • Write-optimized (LSM tree)
  • Distributed, no single point of failure
  • Good for: time-series, IoT, chat messages
  • High write throughput, eventual consistency

Decision Rule of Thumb

  • Need joins? → SQL
  • Need speed? → Redis
  • Need heavy write throughput? → Cassandra
  • Need search? → Elasticsearch
  • Need graphs? → Neo4j

Caching Strategy (Performance)

Patterns

  • Cache-Aside: app checks cache → miss → DB → populate cache
  • Write-Through: write cache + DB together
  • Write-Behind: write cache, async flush to DB
  • Read-Through: cache auto-fetches on miss
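The cache-aside flow can be sketched in a few lines. Plain dicts stand in for Redis and the database here; the `db`, `cache`, and `get_user` names are illustrative:

```python
# Cache-aside sketch: app checks cache, on miss reads DB and populates cache.
db = {"user:1": {"name": "Ada"}}
cache = {}

def get_user(key):
    if key in cache:          # 1. check cache first
        return cache[key]
    value = db.get(key)       # 2. miss -> read from DB
    if value is not None:
        cache[key] = value    # 3. populate cache for the next read
    return value

get_user("user:1")        # miss: reads DB, fills cache
assert "user:1" in cache  # subsequent reads are hits
```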

Eviction Policies

  • LRU: remove least recently used (most common)
  • LFU: remove least frequently used
  • TTL: auto-expire after time period
  • FIFO: first in, first out
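LRU, the most common policy above, fits in a short sketch using Python's `OrderedDict` (`move_to_end` marks recency; `popitem(last=False)` evicts the oldest):

```python
from collections import OrderedDict

# Minimal LRU cache sketch: evicts the least recently used entry
# once capacity is exceeded.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the LRU entry

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes most recently used
cache.put("c", 3)     # capacity exceeded -> evicts "b"
```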

Common Issues

  • Thundering herd: use locks + staggered TTL
  • Cache penetration: bloom filter for non-existent keys
  • Inconsistency: use TTL + event-driven invalidation
  • Cold start: pre-warm cache on deploy
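Staggered TTL (one mitigation for thundering herd) just means adding jitter so hot keys don't all expire at the same instant. The 300-second base and 10% jitter are illustrative values:

```python
import random

# Staggered TTL sketch: add random jitter so keys cached at the same time
# don't all expire together and stampede the DB.
BASE_TTL = 300  # seconds; illustrative

def ttl_with_jitter(base=BASE_TTL, jitter_fraction=0.1):
    jitter = base * jitter_fraction
    return base + random.uniform(-jitter, jitter)

ttl = ttl_with_jitter()   # somewhere in [270, 330] seconds
```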

Key Numbers

  • Redis: 100K+ ops/sec, <1ms latency
  • Memcached: 1M+ ops/sec (simple get/set)
  • Target cache hit ratio: >90%
  • 80/20 rule: 20% of data serves 80% of requests

Load Balancing (Networking)

Algorithms

  • Round Robin: simple rotation
  • Weighted RR: proportional to capacity
  • Least Connections: route to least busy
  • IP Hash: consistent routing by client
  • Random: simple but surprisingly effective
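Two of the algorithms above sketched over a static server list; the server names and connection counts are made up for illustration:

```python
import itertools

servers = ["app-1", "app-2", "app-3"]

# Round Robin: cycle through servers in order.
rr = itertools.cycle(servers)

# Least Connections: route to the server with the fewest active connections.
active = {"app-1": 12, "app-2": 3, "app-3": 7}

def least_connections():
    return min(active, key=active.get)

next(rr)              # "app-1"
next(rr)              # "app-2"
least_connections()   # "app-2" (only 3 active connections)
```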

L4 vs L7

  • L4 (Transport): TCP/UDP level, fast, no inspection
  • L7 (Application): HTTP aware, path/header routing
  • L4 for: WebSocket, database proxy, raw TCP
  • L7 for: microservice routing, A/B testing, SSL termination

Health Checks

  • HTTP GET /health → 200 OK
  • Interval: 10-30 seconds
  • Failure threshold: 2-3 consecutive failures
  • Recovery: 2-3 consecutive successes
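The failure/recovery thresholds above amount to a small state machine per backend. A sketch, with the thresholds as assumed config values:

```python
# Health-check state sketch: mark a server down after N consecutive
# failures and back up after M consecutive successes.
FAIL_THRESHOLD = 3
RECOVER_THRESHOLD = 2

class HealthState:
    def __init__(self):
        self.healthy = True
        self.fails = 0
        self.successes = 0

    def record(self, ok):
        if ok:
            self.fails = 0
            self.successes += 1
            if not self.healthy and self.successes >= RECOVER_THRESHOLD:
                self.healthy = True   # recovered
        else:
            self.successes = 0
            self.fails += 1
            if self.healthy and self.fails >= FAIL_THRESHOLD:
                self.healthy = False  # taken out of rotation

s = HealthState()
for ok in [False, False, False]:  # 3 consecutive failed probes
    s.record(ok)
assert s.healthy is False
```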

Common Products

  • Cloud: AWS ALB/NLB, GCP LB, Azure LB
  • Software: Nginx, HAProxy, Envoy
  • DNS-based: Route53, Cloudflare

Message Queues (Architecture)

When to Use

  • Async processing (email, notifications, image resize)
  • Decouple services for independent scaling
  • Buffer traffic spikes (absorb bursts)
  • Fan-out: one event → multiple consumers
  • Guaranteed delivery (at-least-once)

Kafka vs RabbitMQ vs SQS

  • Kafka: ordered log, high throughput, replay
  • RabbitMQ: traditional queue, routing, low latency
  • SQS: managed, no ops, serverless friendly
  • Kafka for: event streaming, analytics, logs
  • RabbitMQ for: task queues, RPC patterns

Key Concepts

  • At-least-once: may deliver duplicates → consumers must be idempotent
  • At-most-once: may lose messages (rarely used)
  • Exactly-once: hard! Use idempotency + dedup
  • Partition: ordered within partition, parallel across
  • Consumer group: each message processed by one consumer per group
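The idempotent-consumer requirement for at-least-once delivery can be sketched with a dedup set. In a real system `seen` would be persisted (e.g. a DB unique constraint), not held in memory:

```python
# Idempotent consumer sketch: track processed message IDs so redelivered
# duplicates become no-ops.
seen = set()
processed = []

def handle(message):
    if message["id"] in seen:
        return                        # duplicate redelivery -> skip
    seen.add(message["id"])
    processed.append(message["body"]) # side effect runs exactly once

handle({"id": "m1", "body": "charge card"})
handle({"id": "m1", "body": "charge card"})  # redelivered duplicate
assert processed == ["charge card"]
```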

Numbers

  • Kafka: 1M+ msg/sec per cluster
  • SQS: nearly unlimited throughput (standard); ~3000 msg/sec (FIFO, batched)
  • RabbitMQ: 50K msg/sec per cluster
  • Always monitor: queue depth, consumer lag

Consistency Models (Distributed)

CAP Theorem

  • C: Consistency (every read gets latest write)
  • A: Availability (every request gets a response)
  • P: Partition tolerance (system works despite network splits)
  • During a partition, must sacrifice C or A (CP vs AP)

Models (Strongest → Weakest)

  • Linearizable: real-time ordering (slowest)
  • Sequential: global total order
  • Causal: respects cause-effect relationships
  • Eventual: all replicas converge eventually (fastest)

When to Use What

  • Strong: banking, inventory (prevent overselling)
  • Eventual: social media feeds, likes, views
  • Causal: messaging (see messages in order)
  • Most systems: eventual consistency is fine!

Techniques

  • Quorum: W+R > N (e.g., W=2,R=2,N=3)
  • Vector clocks: track causal ordering
  • CRDTs: conflict-free replicated data types
  • 2PC: two-phase commit for distributed transactions
  • Saga: compensating transactions for microservices
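The quorum condition is a one-line check: with N replicas, W write acks, and R read replicas, W + R > N guarantees the read and write sets overlap in at least one replica.

```python
# Quorum sketch: overlap is guaranteed whenever W + R > N.
def overlap_guaranteed(n, w, r):
    return w + r > n

overlap_guaranteed(3, 2, 2)   # True: the W=2, R=2, N=3 example above
overlap_guaranteed(3, 1, 1)   # False: a read may miss the latest write
```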

API Design Patterns (API)

REST Best Practices

  • Nouns for resources: /users, /posts/{id}
  • HTTP verbs: GET/POST/PUT/PATCH/DELETE
  • Status codes: 200, 201, 400, 401, 404, 500
  • Pagination: cursor-based (not offset)
  • Versioning: /v1/users or Accept header

Rate Limiting

  • Token Bucket: smooth, allows bursts
  • Sliding Window: precise, more memory
  • Fixed Window: simple, edge spike issue
  • Typical: 100 req/min per user
  • Return 429 Too Many Requests + Retry-After
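Token bucket, the first algorithm above, fits in a short class. The rate and capacity below match the "100 req/min" example, with a burst size of 10 chosen for illustration:

```python
import time

# Token bucket sketch: refill at `rate` tokens/sec up to `capacity`;
# each request consumes one token. Bursts up to `capacity` are allowed.
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should return 429 + Retry-After

bucket = TokenBucket(rate=100 / 60, capacity=10)  # ~100 req/min, burst of 10
allowed = [bucket.allow() for _ in range(11)]     # 11th rapid request fails
```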

Authentication

  • JWT: stateless, self-contained tokens
  • OAuth 2.0: third-party authorization
  • API Key: simple, for server-to-server
  • Session: stateful, stored server-side

Idempotency

  • GET, PUT, DELETE are naturally idempotent
  • POST: use Idempotency-Key header
  • Critical for: payments, orders, bookings
  • Store request hash → return cached response
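The request-hash/cached-response pattern for POST looks like this in miniature; the `responses` dict stands in for a persistent store, and the names are illustrative:

```python
# Idempotency-Key sketch: a retried POST with the same key returns the
# cached response instead of re-executing the side effect.
responses = {}
orders_created = []

def create_order(idempotency_key, payload):
    if idempotency_key in responses:
        return responses[idempotency_key]  # retry -> cached response
    orders_created.append(payload)         # side effect runs once
    result = {"status": 201, "order": payload}
    responses[idempotency_key] = result
    return result

create_order("key-abc", {"item": "book"})
create_order("key-abc", {"item": "book"})  # client retry: no duplicate order
assert len(orders_created) == 1
```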

Scaling Rules of Thumb (Architecture)

Server Capacity

  • 1 server: 10K-50K QPS (depends on workload)
  • 1 Redis: 100K+ QPS
  • 1 PostgreSQL: 5K-10K QPS (indexed queries)
  • 1 Kafka cluster: 1M+ msg/sec
  • 1 Elasticsearch: 10K-50K search/sec

When to Scale

  • >70% CPU → add servers
  • >10K QPS → add caching
  • >1TB data → consider sharding
  • >100ms p99 → optimize or cache
  • >1000 DB connections → add connection pooler

Scaling Sequence

  • 1. Vertical scaling (bigger machine)
  • 2. Caching (Redis for reads)
  • 3. Read replicas (distribute reads)
  • 4. CDN (static + cacheable content)
  • 5. Horizontal scaling (more app servers)
  • 6. Sharding (distribute writes + data)
  • 7. Async processing (queues)
  • 8. Microservices (if necessary)

The Scaling Interview

  • Always start simple, then scale incrementally
  • Justify each scaling decision with numbers
  • Mention monitoring at every layer
  • Trade-offs: cost, complexity, consistency