SWE Interview Prep Notes
System design
Delivery framework
- Requirements (5m): get a clear understanding of the system you are being asked to design
  - Functional requirements: the features the system must support
  - Non-functional requirements: FCC + SLEDS:
    - Fault tolerance
    - CAP
    - Compliance
    - Scalability
    - Latency
    - Environment
    - Durability
    - Security
  - Capacity estimation: tell the interviewer that you would like to skip upfront estimations and will do the math while designing, when and if it becomes necessary
- Core entities (2m): list the core entities of your system
  - Start with a small list; you can quickly iterate and add to it as you go
- API interface (5m): define the contract between your system and its users
  - Quick decision: which protocol to use (REST, gRPC, GraphQL, etc.)? REST most of the time
- (Optional) Data flow (5m): describe the high-level sequence of actions or processes that the system performs on the inputs to produce the desired outputs
  - Skip if the system doesn’t involve a long sequence of actions
- High-level design (10-15m): draw boxes and arrows to represent the different components of your system and how they interact
  - Components are basic building blocks like servers, databases, caches, etc.
  - The primary goal is to design an architecture that satisfies the API you’ve designed, and thus the requirements you’ve set out
  - Callout: Don’t waste your time documenting every column/field in your schema. For example, your interviewer knows that a User table has a name, email, and password hash, so you don’t need to write these down. Instead, focus on the columns/fields that are particularly relevant to your design.
- Deep dives (10m): harden your design by
  - Ensuring it meets all of your non-functional requirements
  - Addressing edge cases
  - Identifying and addressing issues and bottlenecks
  - Improving the design based on probes from your interviewer
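To make the "API interface" step concrete, here is a minimal sketch of how a REST contract might be jotted down during an interview. The URL-shortener system, the `Endpoint` type, the paths, and the status codes are all hypothetical and chosen only for illustration; they are not part of the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """One row of the API contract (illustrative type, not a real library)."""
    method: str
    path: str
    request: str
    response: str


# Hypothetical contract for a URL shortener, written before any boxes are drawn.
API = [
    Endpoint("POST", "/v1/urls", "{long_url}", "201 {short_code}"),
    Endpoint("GET", "/v1/urls/{short_code}", "-", "302 redirect to long_url"),
    Endpoint("DELETE", "/v1/urls/{short_code}", "-", "204"),
]


def render(api):
    """Format the contract as aligned text, one endpoint per line."""
    return "\n".join(
        f"{e.method:<6} {e.path:<24} {e.request:<12} -> {e.response}" for e in api
    )


print(render(API))
```

Each endpoint should map back to a functional requirement and operate on one of the core entities, which keeps the high-level design honest about what it has to serve.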
Core concepts
Scaling
Work distribution
Consistent hashing: a method to distribute work across many machines while minimizing redistribution when the number of servers changes. It is foundational and applies to many components (databases, caches, message brokers, etc.).
Used in Apache Cassandra, Amazon DynamoDB, CDNs, etc.
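The method described above is commonly known as consistent hashing. The following is a minimal sketch of a hash ring, assuming MD5 as the placement hash and 100 virtual nodes per server; the `HashRing` class, its method names, and the `cache-*` server names are illustrative and not taken from any of the systems mentioned.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (MD5 is used for spread, not security)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        """Drop every point owned by this node; other points stay where they are."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the owner of the first point clockwise from the key's position."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("cache-d")
after = {k: ring.lookup(k) for k in keys}

# Only keys that now belong to the new server change owner; the rest stay put.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to cache-d")
```

With plain `hash(key) % num_servers`, by contrast, adding a server would remap most keys, which is the redistribution this technique is meant to avoid.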
ZooKeeper (Apache), etcd (CoreOS), and Consul (HashiCorp) are coordination services that can serve hash rings to the services that require them
Services subscribe to pieces of config that they’re interested in, and get notified when the config changes through the coordination service’s broadcast mechanism
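The subscribe-and-notify pattern can be sketched with an in-memory stand-in. This is not the client API of ZooKeeper, etcd, or Consul; the `ConfigStore` class, its `subscribe` and `set` methods, and the key names are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Any, Callable


class ConfigStore:
    """In-memory stand-in for a coordination service (not a real client API)."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}
        self._watchers: dict[str, list[Callable[[str, Any], None]]] = defaultdict(list)

    def subscribe(self, key: str, callback: Callable[[str, Any], None]) -> None:
        """Register interest in one key; deliver the current value right away if present."""
        self._watchers[key].append(callback)
        if key in self._data:
            callback(key, self._data[key])

    def set(self, key: str, value: Any) -> None:
        """Store a new value and notify only the subscribers of that key."""
        self._data[key] = value
        for callback in self._watchers[key]:
            callback(key, value)


store = ConfigStore()
seen = []  # ring memberships observed by one subscribing service

store.subscribe("ring/members", lambda key, value: seen.append(value))
store.set("ring/members", ["cache-a", "cache-b"])
store.set("unrelated/flag", True)  # no subscriber for this key, so nobody is notified
store.set("ring/members", ["cache-a", "cache-b", "cache-c"])
```

In a real deployment the callback would rebuild the local hash ring from the new membership list, so every service converges on the same view without polling.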
Sharding
Data distribution
Partition your data so that a node can serve a request from the data it holds, without reaching out to other nodes. Otherwise, queries fall into the “scatter-gather” anti-pattern: a single request fans out to many nodes, which generates a lot of network traffic, is sensitive to failures, and suffers from tail latency.
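The difference between a single-shard read and a scatter-gather read can be sketched as follows. The orders data model, the four in-memory "shards", and the function names are hypothetical; modulo placement is used only to keep the example short.

```python
import hashlib

NUM_SHARDS = 4  # assumed fixed shard count for the sketch
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict stands in for one database node


def shard_for(user_id: str) -> int:
    """Pick a shard from the partition key with a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def save_order(user_id: str, order_id: str, status: str) -> None:
    """Write the order to the shard that owns this user."""
    shards[shard_for(user_id)].setdefault(user_id, []).append(
        {"order_id": order_id, "status": status}
    )


def orders_for_user(user_id: str):
    """Single-shard read: the partition key says exactly where to look."""
    rows = shards[shard_for(user_id)].get(user_id, [])
    return rows, 1  # (rows, number of shards touched)


def orders_with_status(status: str):
    """Scatter-gather read: status is not the partition key, so every shard is queried."""
    rows = []
    for shard in shards:  # one network call per shard in a real system
        for orders in shard.values():
            rows.extend(o for o in orders if o["status"] == status)
    return rows, len(shards)


save_order("alice", "o1", "shipped")
save_order("bob", "o2", "pending")
save_order("alice", "o3", "pending")

alice_rows, alice_touched = orders_for_user("alice")            # touches one shard
pending_rows, pending_touched = orders_with_status("pending")   # touches every shard
```

Choosing `user_id` as the partition key makes the per-user query cheap; the status query is the one that degrades as shards are added, since its latency is bounded by the slowest shard.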
CAP Theorem
Locking
Optimistic concurrency control
Distributed locks
Indexing
Communication protocols
Security
Authentication/Authorization
Encryption
Data protection
Monitoring
Infrastructure monitoring
Service-level monitoring
Application-level monitoring
Move Fast