Cheat Cards

Printable reference cards for quick review before your interview. Each card covers one essential topic.

Estimation Numbers (Math)

Time

  • 1 day = 86,400 sec ≈ 10⁵
  • 1 month = 2.6M sec ≈ 2.5 × 10⁶
  • 1 year = 31.5M sec ≈ 3 × 10⁷

Storage

  • 1 char = 1 byte (ASCII), 1-4 bytes (UTF-8)
  • 1 int = 4 bytes, 1 long = 8 bytes
  • 1 UUID = 16 bytes
  • 1 KB = 10³ B, 1 MB = 10⁶, 1 GB = 10⁹, 1 TB = 10¹²
  • 1 image ≈ 300 KB, 1 video min ≈ 50 MB

Users

  • DAU / MAU ratio ≈ 0.3-0.5
  • Peak traffic ≈ 2-5× average
  • QPS = DAU × actions/day ÷ 86,400
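The QPS formula above can be run as back-of-envelope arithmetic. The figures (10M DAU, 20 actions/user/day, 3× peak factor) are illustrative assumptions, not from any real system:

```python
# Back-of-envelope QPS for a hypothetical app: 10M DAU, 20 actions/user/day.
DAU = 10_000_000
actions_per_day = 20
SECONDS_PER_DAY = 86_400

avg_qps = DAU * actions_per_day / SECONDS_PER_DAY
peak_qps = avg_qps * 3  # peak ≈ 2-5x average; 3x as a middle estimate

print(round(avg_qps))   # 2315 average QPS
print(round(peak_qps))  # 6944 peak QPS
```

In an interview, rounding to "~2.5K average, ~7K peak" is plenty of precision.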

Latency

  • L1 cache: 1 ns
  • L2 cache: 4 ns
  • RAM: 100 ns
  • SSD random: 100 μs
  • HDD seek: 10 ms
  • Same DC RTT: 0.5 ms
  • Cross-region: 150 ms

Database Selection (Storage)

SQL (PostgreSQL, MySQL)

  • ACID transactions, strong consistency
  • Complex joins, aggregations
  • Structured data with relationships
  • Up to ~10TB comfortably, ~10K QPS

NoSQL Document (MongoDB)

  • Flexible schema, nested documents
  • Horizontal scaling built-in
  • Limited joins → denormalize data instead
  • Good for: catalogs, user profiles, CMS

Key-Value (Redis, DynamoDB)

  • Sub-millisecond reads (Redis: in-memory)
  • Simple get/set by key
  • Good for: cache, sessions, counters
  • Redis: 100K+ QPS single node

Wide-Column (Cassandra)

  • Write-optimized (LSM tree)
  • Distributed, no single point of failure
  • Good for: time-series, IoT, chat messages
  • High write throughput, eventual consistency

Decision Rule of Thumb

  • Need joins? → SQL
  • Need speed? → Redis
  • Need heavy write throughput? → Cassandra
  • Need search? → Elasticsearch
  • Need graphs? → Neo4j

Caching Strategy (Performance)

Patterns

  • Cache-Aside: app checks cache → miss → DB → populate cache
  • Write-Through: write cache + DB together
  • Write-Behind: write cache, async flush to DB
  • Read-Through: cache auto-fetches on miss
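The cache-aside flow can be sketched in a few lines. Plain dicts stand in for Redis and the database here; the `db`, `cache`, and `get_user` names are illustrative:

```python
# Cache-aside sketch: app checks cache, on miss reads DB and populates cache.
db = {"user:1": {"name": "Ada"}}
cache = {}

def get_user(key):
    if key in cache:          # 1. check cache first
        return cache[key]
    value = db.get(key)       # 2. miss -> read from DB
    if value is not None:
        cache[key] = value    # 3. populate cache for the next read
    return value

get_user("user:1")        # miss: reads DB, fills cache
assert "user:1" in cache  # subsequent reads are hits
```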

Eviction Policies

  • LRU: remove least recently used (most common)
  • LFU: remove least frequently used
  • TTL: auto-expire after time period
  • FIFO: first in, first out
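LRU, the most common policy above, fits in a short sketch using Python's `OrderedDict` (`move_to_end` marks recency; `popitem(last=False)` evicts the oldest):

```python
from collections import OrderedDict

# Minimal LRU cache sketch: evicts the least recently used entry
# once capacity is exceeded.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the LRU entry

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes most recently used
cache.put("c", 3)     # capacity exceeded -> evicts "b"
```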

Common Issues

  • Thundering herd: use locks + staggered TTL
  • Cache penetration: bloom filter for non-existent keys
  • Inconsistency: use TTL + event-driven invalidation
  • Cold start: pre-warm cache on deploy
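Staggered TTL (one mitigation for thundering herd) just means adding jitter so hot keys don't all expire at the same instant. The 300-second base and 10% jitter are illustrative values:

```python
import random

# Staggered TTL sketch: add random jitter so keys cached at the same time
# don't all expire together and stampede the DB.
BASE_TTL = 300  # seconds; illustrative

def ttl_with_jitter(base=BASE_TTL, jitter_fraction=0.1):
    jitter = base * jitter_fraction
    return base + random.uniform(-jitter, jitter)

ttl = ttl_with_jitter()   # somewhere in [270, 330] seconds
```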

Key Numbers

  • Redis: 100K+ ops/sec, <1ms latency
  • Memcached: 1M+ ops/sec (simple get/set)
  • Target cache hit ratio: >90%
  • 80/20 rule: 20% of data serves 80% of requests

Load Balancing (Networking)

Algorithms

  • Round Robin: simple rotation
  • Weighted RR: proportional to capacity
  • Least Connections: route to least busy
  • IP Hash: consistent routing by client
  • Random: simple but surprisingly effective
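Two of the algorithms above sketched over a static server list; the server names and connection counts are made up for illustration:

```python
import itertools

servers = ["app-1", "app-2", "app-3"]

# Round Robin: cycle through servers in order.
rr = itertools.cycle(servers)

# Least Connections: route to the server with the fewest active connections.
active = {"app-1": 12, "app-2": 3, "app-3": 7}

def least_connections():
    return min(active, key=active.get)

next(rr)              # "app-1"
next(rr)              # "app-2"
least_connections()   # "app-2" (only 3 active connections)
```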

L4 vs L7

  • L4 (Transport): TCP/UDP level, fast, no inspection
  • L7 (Application): HTTP aware, path/header routing
  • L4 for: WebSocket, database proxy, raw TCP
  • L7 for: microservice routing, A/B testing, SSL termination

Health Checks

  • HTTP GET /health → 200 OK
  • Interval: 10-30 seconds
  • Failure threshold: 2-3 consecutive failures
  • Recovery: 2-3 consecutive successes
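The failure/recovery thresholds above amount to a small state machine per backend. A sketch, with the thresholds as assumed config values:

```python
# Health-check state sketch: mark a server down after N consecutive
# failures and back up after M consecutive successes.
FAIL_THRESHOLD = 3
RECOVER_THRESHOLD = 2

class HealthState:
    def __init__(self):
        self.healthy = True
        self.fails = 0
        self.successes = 0

    def record(self, ok):
        if ok:
            self.fails = 0
            self.successes += 1
            if not self.healthy and self.successes >= RECOVER_THRESHOLD:
                self.healthy = True   # recovered
        else:
            self.successes = 0
            self.fails += 1
            if self.healthy and self.fails >= FAIL_THRESHOLD:
                self.healthy = False  # taken out of rotation

s = HealthState()
for ok in [False, False, False]:  # 3 consecutive failed probes
    s.record(ok)
assert s.healthy is False
```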

Common Products

  • Cloud: AWS ALB/NLB, GCP LB, Azure LB
  • Software: Nginx, HAProxy, Envoy
  • DNS-based: Route53, Cloudflare

Message Queues (Architecture)

When to Use

  • Async processing (email, notifications, image resize)
  • Decouple services for independent scaling
  • Buffer traffic spikes (absorb bursts)
  • Fan-out: one event → multiple consumers
  • Guaranteed delivery (at-least-once)

Kafka vs RabbitMQ vs SQS

  • Kafka: ordered log, high throughput, replay
  • RabbitMQ: traditional queue, routing, low latency
  • SQS: managed, no ops, serverless friendly
  • Kafka for: event streaming, analytics, logs
  • RabbitMQ for: task queues, RPC patterns

Key Concepts

  • At-least-once: may deliver duplicates → consumers must be idempotent
  • At-most-once: may lose messages (rarely used)
  • Exactly-once: hard! Use idempotency + dedup
  • Partition: ordered within partition, parallel across
  • Consumer group: each message processed by one consumer per group
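The idempotent-consumer requirement for at-least-once delivery can be sketched with a dedup set. In a real system `seen` would be persisted (e.g. a DB unique constraint), not held in memory:

```python
# Idempotent consumer sketch: track processed message IDs so redelivered
# duplicates become no-ops.
seen = set()
processed = []

def handle(message):
    if message["id"] in seen:
        return                        # duplicate redelivery -> skip
    seen.add(message["id"])
    processed.append(message["body"]) # side effect runs exactly once

handle({"id": "m1", "body": "charge card"})
handle({"id": "m1", "body": "charge card"})  # redelivered duplicate
assert processed == ["charge card"]
```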

Numbers

  • Kafka: 1M+ msg/sec per cluster
  • SQS: nearly unlimited throughput (standard); ~3000 msg/sec (FIFO, batched)
  • RabbitMQ: 50K msg/sec per cluster
  • Always monitor: queue depth, consumer lag

Consistency Models (Distributed)

CAP Theorem

  • C: Consistency (every read gets latest write)
  • A: Availability (every request gets a response)
  • P: Partition tolerance (system works despite network splits)
  • During a partition, must sacrifice C or A (CP vs AP)

Models (Strongest → Weakest)

  • Linearizable: real-time ordering (slowest)
  • Sequential: global total order
  • Causal: respects cause-effect relationships
  • Eventual: all replicas converge eventually (fastest)

When to Use What

  • Strong: banking, inventory (prevent overselling)
  • Eventual: social media feeds, likes, views
  • Causal: messaging (see messages in order)
  • Most systems: eventual consistency is fine!

Techniques

  • Quorum: W+R > N (e.g., W=2,R=2,N=3)
  • Vector clocks: track causal ordering
  • CRDTs: conflict-free replicated data types
  • 2PC: two-phase commit for distributed transactions
  • Saga: compensating transactions for microservices
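The quorum condition is a one-line check: with N replicas, W write acks, and R read replicas, W + R > N guarantees the read and write sets overlap in at least one replica.

```python
# Quorum sketch: overlap is guaranteed whenever W + R > N.
def overlap_guaranteed(n, w, r):
    return w + r > n

overlap_guaranteed(3, 2, 2)   # True: the W=2, R=2, N=3 example above
overlap_guaranteed(3, 1, 1)   # False: a read may miss the latest write
```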

API Design Patterns (API)

REST Best Practices

  • Nouns for resources: /users, /posts/{id}
  • HTTP verbs: GET/POST/PUT/PATCH/DELETE
  • Status codes: 200, 201, 400, 401, 404, 500
  • Pagination: cursor-based (not offset)
  • Versioning: /v1/users or Accept header

Rate Limiting

  • Token Bucket: smooth, allows bursts
  • Sliding Window: precise, more memory
  • Fixed Window: simple, edge spike issue
  • Typical: 100 req/min per user
  • Return 429 Too Many Requests + Retry-After
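Token bucket, the first algorithm above, fits in a short class. The rate and capacity below match the "100 req/min" example, with a burst size of 10 chosen for illustration:

```python
import time

# Token bucket sketch: refill at `rate` tokens/sec up to `capacity`;
# each request consumes one token. Bursts up to `capacity` are allowed.
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should return 429 + Retry-After

bucket = TokenBucket(rate=100 / 60, capacity=10)  # ~100 req/min, burst of 10
allowed = [bucket.allow() for _ in range(11)]     # 11th rapid request fails
```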

Authentication

  • JWT: stateless, self-contained tokens
  • OAuth 2.0: third-party authorization
  • API Key: simple, for server-to-server
  • Session: stateful, stored server-side

Idempotency

  • GET, PUT, DELETE are naturally idempotent
  • POST: use Idempotency-Key header
  • Critical for: payments, orders, bookings
  • Store request hash → return cached response
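The request-hash/cached-response pattern for POST looks like this in miniature; the `responses` dict stands in for a persistent store, and the names are illustrative:

```python
# Idempotency-Key sketch: a retried POST with the same key returns the
# cached response instead of re-executing the side effect.
responses = {}
orders_created = []

def create_order(idempotency_key, payload):
    if idempotency_key in responses:
        return responses[idempotency_key]  # retry -> cached response
    orders_created.append(payload)         # side effect runs once
    result = {"status": 201, "order": payload}
    responses[idempotency_key] = result
    return result

create_order("key-abc", {"item": "book"})
create_order("key-abc", {"item": "book"})  # client retry: no duplicate order
assert len(orders_created) == 1
```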

Scaling Rules of Thumb (Architecture)

Server Capacity

  • 1 server: 10K-50K QPS (depends on workload)
  • 1 Redis: 100K+ QPS
  • 1 PostgreSQL: 5K-10K QPS (indexed queries)
  • 1 Kafka cluster: 1M+ msg/sec
  • 1 Elasticsearch: 10K-50K search/sec

When to Scale

  • >70% CPU → add servers
  • >10K QPS → add caching
  • >1TB data → consider sharding
  • >100ms p99 → optimize or cache
  • >1000 DB connections → add connection pooler

Scaling Sequence

  • 1. Vertical scaling (bigger machine)
  • 2. Caching (Redis for reads)
  • 3. Read replicas (distribute reads)
  • 4. CDN (static + cacheable content)
  • 5. Horizontal scaling (more app servers)
  • 6. Sharding (distribute writes + data)
  • 7. Async processing (queues)
  • 8. Microservices (if necessary)

The Scaling Interview

  • Always start simple, then scale incrementally
  • Justify each scaling decision with numbers
  • Mention monitoring at every layer
  • Trade-offs: cost, complexity, consistency