Why is Kafka fast?
1. Batch Processing: Kafka groups messages together into batches, which reduces the network calls and disk I/O operations, thereby increasing throughput.
2. Sequential I/O: Kafka stores messages in a sequential, ordered manner on disk. This takes advantage of the underlying file system and disk's fast sequential read and write capabilities.
3. Zero Copy: Kafka uses the zero-copy technique to transfer data directly from the file system cache to network buffers, which reduces CPU usage.
4. Distributed System: Kafka is a distributed system that can be scaled horizontally by adding more nodes to the cluster. This allows it to handle high volumes of data and traffic.
5. Immutable Messages: Once written, messages in Kafka are immutable and are not updated, which simplifies the system and improves performance.
6. In-Memory Offsets: Kafka stores the offsets of messages in memory, which makes reading messages very fast. |
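As a rough illustration of point 1 (batching), here is a minimal producer sketch, assuming the kafka-python client, a broker at localhost:9092, and an "events" topic (all illustrative):

from kafka import KafkaProducer  # assumes the kafka-python package

# Messages are accumulated into per-partition batches (up to batch_size bytes)
# and flushed after linger_ms, reducing network round trips and disk I/O.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    batch_size=32 * 1024,   # 32 KB per-partition batch
    linger_ms=10,           # wait up to 10 ms to fill a batch
    acks="all",
)

for i in range(1000):
    producer.send("events", value=f"event-{i}".encode())

producer.flush()  # force any remaining batched messages out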
System Design Interview: A Step-By-Step Guide
Cache Systems Every Developer Should Know
What happens when you type a URL into your browser?
1. URL Parsing: The browser parses the URL to extract the protocol (e.g., HTTP, HTTPS), domain name (e.g., www.example.com), and any additional path or query parameters.
2. DNS Lookup: The browser checks its cache to find the IP address associated with the domain name. If not found, it sends a DNS (Domain Name System) lookup request to a DNS server to obtain the IP address.
3. TCP Connection: The browser establishes a TCP (Transmission Control Protocol) connection with the web server at the obtained IP address. This involves a three-way handshake to establish a reliable connection.
4. HTTP Request: The browser sends an HTTP (Hypertext Transfer Protocol) request to the web server. The request includes the HTTP method (e.g., GET, POST), the requested resource (e.g., /index.html), headers (e.g., user-agent, cookies), and other optional data.
5. Server Processing: The web server receives the HTTP request and processes it. This may involve accessing databases, executing server-side code, or retrieving files from the server's file system.
6. HTTP Response: The web server generates an HTTP response containing the requested resource, along with headers and a status code. The response may also include cookies, caching directives, or other metadata.
7. Data Transfer: The server sends the HTTP response back to the browser over the established TCP connection. The response is divided into packets and transmitted over the network.
8. Browser Rendering: The browser receives the response and begins rendering the web page. It parses the HTML, executes any embedded scripts, and fetches additional resources like CSS stylesheets, images, or JavaScript files referenced in the HTML.
9. Page Display: The browser combines the received resources to render and display the web page to the user. This includes rendering text, applying styles, and executing JavaScript code to add interactivity. |
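A rough sketch of steps 2-7 using only Python's standard library (example.com and /index.html are placeholder values):

import socket
import http.client

# Step 2: DNS lookup - resolve the domain name to an IP address.
ip_address = socket.gethostbyname("example.com")
print("Resolved IP:", ip_address)

# Steps 3-4: open a TCP connection and send an HTTP GET request.
conn = http.client.HTTPConnection("example.com", 80, timeout=10)
conn.request("GET", "/index.html", headers={"User-Agent": "demo-browser"})

# Steps 6-7: read the HTTP response (status code, headers, body).
response = conn.getresponse()
print(response.status, response.reason)
body = response.read()
conn.close()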
Types of different databases
HTTP/1.1, HTTP/2, and HTTP/3
5 Inter-Process Communications
Linux file system explained
How does GraphQL work at LinkedIn?
Things to consider when using cache
Suitable Scenarios:
- In-memory solution
- Read heavy system
- Data is not frequently updated
Caching Techniques:
- Cache aside
- Write-through
- Read-through
- Write-around
- Write-back
Cache Eviction Algorithms:
- Least Recently Used (LRU)
- Least Frequently Used (LFU)
- First-in First-out (FIFO)
- Random Replacement (RR)
Key Metrics:
- Cache Hit Ratio
- Latency
- Throughput
- Invalidation Rate
- Memory Usage
- CPU usage
- Network usage
Other Issues:
- Thundering herd on cold start
- Time-to-live (TTL) |
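A minimal cache-aside sketch, assuming the redis-py client and a hypothetical fetch_user_from_db function (key name and TTL are illustrative):

import json
import redis  # assumes the redis-py package

cache = redis.Redis(host="localhost", port=6379)
TTL_SECONDS = 300  # time-to-live to bound staleness

def get_user(user_id, fetch_user_from_db):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                                # cache hit
        return json.loads(cached)
    user = fetch_user_from_db(user_id)                    # cache miss: read from the database
    cache.set(key, json.dumps(user), ex=TTL_SECONDS)      # populate the cache for later reads
    return user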
which type of database to use
Online Ticketing Platform
The Chinese train ticket booking system has challenges similar to those of the Ticketmaster system:
Very high concurrent visits during peak hours.
The QPS for checking remaining tickets and orders is very high
A lot of bots
The solutions
Separate read and write requests. Because anxious users kept refreshing the web page to check if there were tickets available, the system could come under huge pressure.
To handle the calculations and queries in memory, the remaining-ticket components were moved entirely to GemFire. It is possible to fit the entire country's train tickets into several gigabytes of memory.
In addition, the order query component was moved to GemFire to reduce the load on the order database. Hadoop was used to store historical orders.
Leverage public cloud for elastic capacity.
Ban bots. It reduced the traffic by 95%.
Increase the bandwidth of the system.
Increase system availability by setting up more data centers in different cities.
Design multiple emergency plans. |
How will you design the Stack Overflow website?
What people think it should look like
The interviewer is probably expecting something on the left side.
Microservices are used to decompose the system into small components.
Each service has its own database. Use cache heavily.
Each service is sharded.
The services talk to each other asynchronously through message queues.
The services are implemented using Event Sourcing with CQRS.
Showing off knowledge in distributed systems such as eventual consistency, CAP theorem, etc.
What it actually is
Stack Overflow serves all of its traffic with only 9 on-premises web servers, and it's a monolith! It runs on its own servers and does not run in the cloud. |
Secret Sauce Behind NoSQL: LSM Tree
Design patterns cheat sheet
Design patterns cheat sheet2
Scan to pay
Here are the steps for generating the QR code:
When you want to pay for your shopping, the cashier tallies up all the goods and calculates the total amount due, for example, $123.45. The checkout has an order ID of SN129803. The cashier clicks the "checkout" button.
The cashierโs computer sends the order ID and the amount to PSP.
The PSP saves this information to the database and generates a QR code URL.
PSPโs Payment Gateway service reads the QR code URL.
The payment gateway returns the QR code URL to the merchantโs computer.
The merchantโs computer sends the QR code URL (or image) to the checkout counter.
The checkout counter displays the QR code.
These 7 steps are completed in less than a second. Now it's the consumer's turn to pay from their digital wallet by scanning the QR code:
The consumer opens their digital wallet app to scan the QR code.
After confirming the amount is correct, the consumer clicks the "pay" button.
The digital wallet app notifies the PSP that the consumer has paid the given QR code.
The PSP payment gateway marks this QR code as paid and returns a success message to the consumer's digital wallet app.
The PSP payment gateway notifies the merchant that the consumer has paid the given QR code.
Over to you: I have detailed how to pay using a dynamic QR code. It is dynamic because the QR code is dynamically generated each time. But sometimes, you could pay by scanning a printed QR code in a merchant's shop, which is called a static QR code. Do you know how a static QR code works? |
Linux file permission illustrated.
How to store passwords safely
How to avoid crawling duplicate URLs at Google scale
Option 1: Use a Set data structure to check if a URL already exists or not. Set is fast, but it is not space-efficient.
Option 2: Store URLs in a database and check if a new URL is in the database. This can work but the load to the database will be very high.
Option 3: Bloom filter. This option is preferred. The Bloom filter was proposed by Burton Howard Bloom in 1970. It is a probabilistic data structure that is used to test whether an element is a member of a set.
- false: the element is definitely not in the set.
- true: the element is probably in the set.
False-positive matches are possible, but false negatives are not.
The diagram below illustrates how the Bloom filter works. The basic data structure for the Bloom filter is a bit vector. Each bit represents a hashed value. |
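A toy Bloom filter to make the idea concrete (the bit-vector size, hash count, and use of SHA-256 are arbitrary choices for illustration):

import hashlib

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size              # the bit vector

    def _positions(self, item):
        # Derive several bit positions from differently-seeded hashes of the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False => definitely not present; True => probably present.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("https://example.com/page1")
print(bf.might_contain("https://example.com/page1"))  # True
print(bf.might_contain("https://example.com/other"))  # False (with high probability)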
How to scale a website to support millions of users
Step 1 - With the growth of the user base, one single application server cannot handle the traffic anymore. We put the application server and the database server into two separate servers.
Step 2 - The business continues to grow, and a single application server is no longer enough. So we deploy a cluster of application servers.
Step 3 - Now the incoming requests have to be routed to multiple application servers. How can we ensure each application server gets an even load? The load balancer handles this nicely.
Step 4 - With the business continuing to grow, the database might become the bottleneck. To mitigate this, we separate reads and writes in a way that frequent read queries go to read replicas. With this setup, the throughput for the database writes can be greatly increased.
Step 5 - Suppose the business continues to grow. One single database cannot handle the load on both the inventory table and user table. We have a few options:
1. Vertical scaling. Adding more power (CPU, RAM, etc.) to the database server. It has a hard limit.
2. Horizontal scaling by adding more database servers.
3. Adding a caching layer to offload read requests.
Step 6 - Now we can modularize the functions into different services. The architecture becomes service-oriented / microservice. |
S3-like storage system
In S3, an object resides in a bucket. The path looks like this: /bucket-to-share/script.txt. The bucket only has metadata. The object has metadata and the actual data.
The diagram below (Figure 2) illustrates how file uploading works. In this example, we first create a bucket named "bucket-to-share" and then upload a file named "script.txt" to the bucket.
1. The client sends an HTTP PUT request to create a bucket named "bucket-to-share." The request is forwarded to the API service.
2. The API service calls Identity and Access Management (IAM) to ensure the user is authorized and has WRITE permission.
3. The API service calls the metadata store to create an entry with the bucket info in the metadata database. Once the entry is created, a success message is returned to the client.
4. After the bucket is created, the client sends an HTTP PUT request to create an object named "script.txt".
5. The API service verifies the user's identity and ensures the user has WRITE permission on the bucket.
6. Once validation succeeds, the API service sends the object data in the HTTP PUT payload to the data store. The data store persists the payload as an object and returns the UUID of the object.
7. The API service calls the metadata store to create a new entry in the metadata database. It contains important metadata such as the object_id (UUID), bucket_id (which bucket the object belongs to), object_name, etc. |
Match buy and sell stock orders
Stocks go up and down. Do you know what data structure is used to efficiently match buy and sell orders?
Stock exchanges use order books. An order book is an electronic list of buy and sell orders, organized by price levels. It has a buy book and a sell book, where each side of the book contains a bunch of price levels, and each price level contains a list of orders (first in first out).
The diagram below is an example of price levels and the queued quantity at each price level.
So what happens when you place a market order to buy 2700 shares in the diagram?
- The buy order is matched with all the sell orders at price 100.10, and the first order at price 100.11 (illustrated in light red).
- Now, because the big buy order "eats up" the first price level on the sell book, the best ask price goes up from 100.10 to 100.11.
- So when the market is bullish, people tend to buy stocks aggressively, and the price goes up and up.
An efficient data structure for an order book must satisfy:
- Constant lookup time. Operations include: get volume at a price level or between price levels, query best bid/ask.
- Fast add/cancel/execute/update operations, preferably O(1) time complexity. Operations include: place a new order, cancel an order, and match an order. |
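A simplified sketch of the sell side of a book, with a FIFO queue per price level and a market buy matching against the best ask first (prices, quantities, and order IDs are made up; real matching engines use more specialized structures to hit the O(1) targets above):

from collections import deque

# Sell book: price level -> FIFO queue of (order_id, quantity).
sell_book = {
    100.10: deque([("s1", 1000), ("s2", 500)]),
    100.11: deque([("s3", 2000)]),
}

def market_buy(quantity):
    """Match a market buy order against the lowest ask levels first."""
    fills = []
    for price in sorted(sell_book):              # best (lowest) ask first
        level = sell_book[price]
        while level and quantity > 0:
            order_id, available = level[0]
            traded = min(quantity, available)
            fills.append((order_id, price, traded))
            quantity -= traded
            if traded == available:
                level.popleft()                  # resting order fully filled
            else:
                level[0] = (order_id, available - traded)
        if not level:
            del sell_book[price]                 # price level exhausted; best ask moves up
        if quantity == 0:
            break
    return fills

print(market_buy(2700))  # fills 1500 @ 100.10, then 1200 @ 100.11; best ask becomes 100.11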
Why is single-threaded Redis so fast?
1. It is an in-memory database
2. I/O multiplexing: a single thread waits on many sockets
3. Efficient data structures |
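Point 2 is the classic single-threaded event loop. A minimal sketch with Python's selectors module, here as a toy echo server on an arbitrary port, showing one thread waiting on many sockets:

import selectors
import socket

sel = selectors.DefaultSelector()              # epoll/kqueue under the hood
listener = socket.socket()
listener.bind(("localhost", 6380))
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ)

while True:
    # One thread blocks here, waiting on ALL registered sockets at once.
    for key, _ in sel.select():
        sock = key.fileobj
        if sock is listener:
            conn, _ = listener.accept()        # new client connection
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = sock.recv(1024)
            if data:
                sock.sendall(data)             # echo the request back
            else:
                sel.unregister(sock)
                sock.close()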
Top 5 Most Used Architecture Patterns
Algorithms You Should Know
Top 7 Ways to 10x Your API Performance
Git MERGE vs REBASE: Everything You Need to Know
SSL, TLS, HTTPS Explained
1. Client Hello: The client initiates a secure connection by sending a "Client Hello" message to the server. This message includes information about the client's supported encryption algorithms and other parameters.
2. Server Hello: The server responds with a "Server Hello" message, selecting the strongest encryption algorithm and other parameters that both the client and server support.
3. Certificate Exchange: The server sends its digital certificate to the client. The certificate contains the server's public key, which is used for encryption and authentication.
4. Certificate Validation: The client verifies the authenticity of the server's certificate. It checks if the certificate is issued by a trusted Certificate Authority (CA) and if it has not expired or been revoked.
5. Key Exchange: The client generates a random symmetric encryption key and encrypts it using the server's public key. This encrypted key is sent to the server.
6. Session Key: The server decrypts the encrypted key using its private key and obtains the symmetric encryption key. Both the client and server now have the same session key for secure communication.
7. Secure Communication: The client and server use the session key to encrypt and decrypt data exchanged between them. This ensures that the data transmitted over the network is protected from eavesdropping and tampering.
8. HTTPS Connection: From this point onwards, the client and server communicate using the secure HTTPS protocol. All HTTP requests and responses are encrypted and decrypted using the session key. |
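From the application's point of view, most of these steps happen inside the TLS library. A small Python sketch that performs the handshake (steps 1-6) and then sends an encrypted request (steps 7-8); example.com is a placeholder host:

import socket
import ssl

context = ssl.create_default_context()          # trusted CAs, certificate validation

with socket.create_connection(("example.com", 443)) as raw_sock:
    # wrap_socket runs the TLS handshake: hello messages, certificate
    # validation, and key exchange all happen inside this call.
    with context.wrap_socket(raw_sock, server_hostname="example.com") as tls_sock:
        print("Negotiated protocol:", tls_sock.version())   # e.g. TLSv1.3
        print("Cipher suite:", tls_sock.cipher())
        tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
        print(tls_sock.recv(200))               # encrypted on the wire, decrypted here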
How to Choose a Replication Strategy
1. Leader-follower model
2. Multi-leader
Multi-leader replication provides high availability but requires careful design around consensus, conflict detection, and resolution mechanisms.
The primary advantage of this model is increased write availability.
The main disadvantage is managing conflicts:
2.1 Last Write Wins
2.2 Conflict-free Replicated Data Types (CRDTs)
CRDTs allow for seamless reconciliation of conflicting changes by merging them.
2.3 Operational Transformation
Operational transformation is often used in real-time collaborative applications. It takes the operation itself into account, not just the state of the data.
2.4 Application-specific Resolution
3. Leaderless replication
Quorum Writes and Reads
In this system, we use three important values.
'n' is the total number of nodes in our system.
'w', the write quorum, is the minimum number of nodes that need to agree for a write to be considered successful.
'r', the read quorum, is the minimum number of nodes that need to agree for a read to be valid.
For strong consistency, a general guideline is to have w + r > n. It ensures that any read overlaps with any write and returns the most recent value. |
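A tiny helper capturing the quorum rule (purely illustrative):

def is_strongly_consistent(n, w, r):
    """w + r > n guarantees every read quorum overlaps every write quorum."""
    return w + r > n

# n=3 with w=2, r=2: any 2 readers must include at least 1 node holding the latest write.
print(is_strongly_consistent(n=3, w=2, r=2))  # True
print(is_strongly_consistent(n=3, w=1, r=1))  # False - a read may miss the latest write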
HTTPS, SSL Handshake, and Data Encryption
CI/CD Pipeline Explained in Simple Terms
Common Rate Limiting Algorithms
Fixed Window Counter
Sliding Window Log
Sliding Window Counter
Token Bucket
Leaky Bucket |
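As an example of one of these algorithms, a minimal token bucket sketch (capacity and refill rate are arbitrary):

import time

class TokenBucket:
    def __init__(self, capacity, refill_rate_per_sec):
        self.capacity = capacity
        self.refill_rate = refill_rate_per_sec
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens proportionally to the time elapsed, capped at the capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1        # spend one token for this request
            return True
        return False                # bucket empty: reject (or queue) the request

bucket = TokenBucket(capacity=10, refill_rate_per_sec=5)
print([bucket.allow() for _ in range(12)])  # the burst drains the bucket, then requests are limited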
8 Data Structures in Databases
2-factor authenticators
proxy
A forward proxy is a server that sits between a group of client machines and the internet.
A reverse proxy sits between the internet and the web servers. It intercepts the requests from clients and talks to the web server on behalf of the clients. |
iQIYI database selection trees
Five ways to generate distributed unique ID
Why is RESTful API so popular?
What is a File Descriptor?
A file descriptor represents an open file. It is a unique number assigned by the operating system to each file. It is an abstraction for working with files. We need to use file descriptors to read from or write to files in our program. Each process maintains its own file descriptor table.
The diagram below shows the layered architecture in Linux filesystem. Letโs take process 1234 as an example.
- In User Space
When we open a file called "fileA.txt" in Process 1234, we get file descriptor fd1, which is equal to 3. We can then pass the file descriptor to other functions to write data to the file.
- In Kernel Space
In the Linux kernel, there is a process table to maintain the data for the processes. Each process has an entry in the table. Each process maintains a file descriptor table, with file descriptors as its indices. Notice that file descriptors 0, 1, and 2 are reserved in each file descriptor table to represent stdin, stdout, and stderr.
The file pointer points to an entry in the open file table, which has information about open files across all processes. Multiple file descriptors can point to the same file table entry. For example, file descriptors 0, 1, and 2 point to the same open file table entry.
Since different open file table entries can represent the same file, it is a waste of resources to store the file's static information so many times. We need another abstraction layer called the "vnode table" to store the static data.
In each file table entry, there is a vnode pointer, which points to an entry in the vnode table. The static information includes the file type, function pointers, reference counts, inode, etc. The inode describes a physical object in the filesystem.
- In Filesystem
The inode array element stores the actual file information, including permission mode, owners, timestamps, etc. The inode also points to the data blocks stored in the filesystem.
Over to you: When we close a file in a program, do you know which entries are deleted in these data structures? |
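A small user-space demonstration of file descriptors (fileA.txt is the same illustrative file name as above):

import os

# 0, 1, and 2 are already taken by stdin, stdout, and stderr, so the first file
# opened by this process typically gets file descriptor 3.
fd = os.open("fileA.txt", os.O_CREAT | os.O_WRONLY)
print("file descriptor:", fd)        # usually 3

os.write(fd, b"hello\n")             # the fd indexes this process's fd table

# close() releases this process's fd table entry; the kernel's open file table
# entry is freed once no descriptor in any process refers to it anymore.
os.close(fd)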
notifications pushed to our phones or PCs
How do we design for high availability
Inner, Left, Right, and Full join
Process and Thread
A Process means a program is in execution. When a program is loaded into the memory and becomes active, the program becomes a process. The process requires some essential resources such as registers, program counter, and stack.
A Thread is the smallest unit of execution within a process.
The following process explains the relationship between program, process, and thread.
1. The program contains a set of instructions.
2. The program is loaded into memory. It becomes one or more running processes.
3. When a process starts, it is assigned memory and resources. A process can have one or more threads. For example, in the Microsoft Word app, one thread might be responsible for spell checking and another thread for inserting text into the doc.
Main differences between process and thread:
- Processes are usually independent, while threads exist as subsets of a process.
- Each process has its own memory space. Threads that belong to the same process share the same memory.
- A process is a heavyweight operation. It takes more time to create and terminate.
- Context switching is more expensive between processes.
- Inter-thread communication is faster than inter-process communication. |
SWIFT: the Society for Worldwide Interbank Financial Telecommunication
Step 1: Bank A sends a message with transfer details to Regional Processor A in New York. The destination is Bank B.
Step 2: Regional processor validates the format and sends it to Slice Processor A. The Regional Processor is responsible for input message validation and output message queuing. The Slice Processor is responsible for storing and routing messages safely.
Step 3: Slice Processor A stores the message.
Step 4: Slice Processor A informs Regional Processor A the message is stored.
Step 5: Regional Processor A sends ACK/NAK to Bank A. ACK means a message will be sent to Bank B. NAK means the message will NOT be sent to Bank B.
Step 6: Slice Processor A sends the message to Regional Processor B in London.
Step 7: Regional Processor B stores the message temporarily.
Step 8: Regional Processor B assigns a unique ID MON (Message Output Number) to the message and sends it to Slice Processor B
Step 9: Slice Processor B validates MON.
Step 10: Slice Processor B authorizes Regional Processor B to send the message to Bank B.
Step 11: Regional Processor B sends the message to Bank B.
Step 12: Bank B receives the message and stores it.
Step 13: Bank B sends UAK/UNK to Regional Processor B. UAK (user positive acknowledgment) means Bank B received the message without error; UNK (user negative acknowledgment) means Bank B detected a checksum failure.
Step 14: Regional Processor B creates a report based on Bank B's response, and sends it to Slice Processor B.
Step 15: Slice Processor B stores the report.
Step 16 - 17: Slice Processor B sends a copy of the report to Slice Processor A. Slice Processor A stores the report. |
Low latency stock exchange
How does a modern stock exchange achieve microsecond latency? The principle is:
Do less on the critical path
- Fewer tasks on the critical path
- Less time on each task
- Fewer network hops
- Less disk usage
For the stock exchange, the critical path is:
- start: an order comes into the order manager
- mandatory risk checks
- the order gets matched and the execution is sent back
- end: the execution comes out of the order manager
Other non-critical tasks should be removed from the critical path.
We put together a design as shown in the diagram:
- deploy all the components in a single giant server (no containers)
- use shared memory as an event bus to communicate among the components, no hard disk
- key components like Order Manager and Matching Engine are single-threaded on the critical path, and each pinned to a CPU so that there is no context switch and no locks
- the single-threaded application loop executes tasks one by one in sequence
- other components listen on the event bus and react accordingly |
Top 5 Most-Used Deployment Strategies
Secret To Optimizing SQL Queries
To write sargable queries:
- Avoid using functions or calculations on indexed columns in the WHERE clause
- Use direct comparisons when possible, instead of wrapping the column in a function
- If we need to use a function on a column, consider creating a computed column or a function-based index, if the database system supports it |
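For example (an illustrative query): a predicate like WHERE YEAR(order_date) = 2023 forces the database to apply the function to every row and usually prevents an index on order_date from being used, whereas the sargable equivalent WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01' lets the database seek the index directly.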
10 Key Data Structures We Use Every Day
What Are Microservices Really All About?
Bare Metal, Virtual Machines, and Containers
How do you design a high-concurrency system?
1. Automatic scaling (scale out and in), with monitoring and alerting
2. Adopt a microservice architecture and scale each service independently based on its needs
3. Optimize the caching strategy
4. Ensure load balancing
5. Plan for failover, including data center failures
1. Horizontal Scaling: distribute the load by adding more server instances so the system can handle more concurrent requests. This is typically implemented with a load balancer that spreads requests across multiple servers.
2. Caching: use caches to store frequently accessed data and reduce the number of hits on the backend database or other resources. Common caching technologies include in-memory caches (such as Redis) and distributed caches (such as Memcached).
3. Asynchronous Processing: turn time-consuming operations into asynchronous tasks handled through message queues or task queues. This reduces request waiting time and increases system throughput.
4. Database Optimization: use a suitable database engine and optimization techniques such as indexing, partitioning, and caching of query results to improve read/write performance and concurrency.
5. Stateless Design: design the system to be stateless where possible, storing state externally (for example, in a database or cache). This makes the system easier to scale horizontally and improves elasticity.
6. Rate Limiting and Circuit Breaking: limit the request rate, or cut off part of the traffic when the system load is too high, to protect the system from being overloaded.
7. Distributed Architecture: split the system into multiple independent services that communicate via messaging or API calls. This improves scalability and fault tolerance.
8. Monitoring and Automation: build monitoring to track system performance and health in real time, so potential problems are detected and resolved early. Automated deployment and scaling reduce manual intervention and improve reliability and maintainability.
9. Fault Tolerance and Recovery: design the system to tolerate failures of individual components or services and to recover quickly, using backups and redundancy to keep the system available.
10. Performance Testing and Optimization: run performance and load tests to find bottlenecks, then optimize and tune accordingly. |
There are 3 ways to use Redis as a message queue
Pub/Sub is convenient but has some delivery restrictions. The consumer subscribes to a channel and receives the data when a producer publishes data to the same channel. The restriction is that the data is delivered at most once. If a consumer was down and didn't receive the published data, that data is lost. Also, the data is not persisted on disk. If Redis goes down, all Pub/Sub data is lost. Pub/Sub is suitable for metrics monitoring, where some data loss is acceptable.
The List data structure in Redis can construct a FIFO (First-In-First-Out) queue. The consumer uses BLPOP to wait for messages in blocking mode, so a timeout should be applied. Consumers waiting on the same List form a consumer group where each message is consumed by only one consumer. As a Redis data structure, List can be persisted to disk.
Stream solves the restrictions of the above two methods. Consumers choose where to read messages from: "$" for new messages, "<id>" for a specific message id, or "0-0" for reading from the start. |
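A rough sketch of the List and Stream approaches, assuming the redis-py client (key names and fields are arbitrary):

import redis  # assumes the redis-py package

r = redis.Redis()

# List as a FIFO queue: produce with RPUSH, consume with blocking BLPOP.
r.rpush("jobs", "job-1")
item = r.blpop("jobs", timeout=5)          # blocks up to 5 s; returns None on timeout

# Stream: messages are persisted with IDs, and consumers choose where to start reading.
r.xadd("events", {"type": "signup", "user": "42"})
new_only = r.xread({"events": "$"}, block=2000)    # "$": only messages published after this call
from_start = r.xread({"events": "0-0"}, count=10)  # "0-0": read from the very beginning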
Benefits of Message Queues
Fan-out
Asynchronous Processing
Rate Limiting
Decoupling
Horizontal Scalability
Message Persistence
Batch Processing
Message Ordering |
A comparison between B-Tree and B+ Tree
9 best practices for developing microservices
data warehouse and a data lake
The 10 Algorithms That Dominate Our World
18 Most-used Linux Commands You Should Know
microservices tech stack
user identity management
WWW-Authenticate is the most basic method. The browser asks you for a username and password. Because the login life cycle cannot be controlled, it is seldom used today.
Session-cookie gives finer control over the login life cycle. The server maintains session storage, and the browser keeps the ID of the session. A cookie usually only works with browsers and is not mobile app friendly.
To address the compatibility issue, the token can be used. The client sends the token to the server, and the server validates the token. The downside is that the token needs to be encrypted and decrypted, which may be time-consuming.
JWT is a standard way of representing tokens. This information can be verified and trusted because it is digitally signed. Since JWT contains the signature, there is no need to save session information on the server side.
By using SSO (single sign-on), you can sign on only once and log in to multiple websites. It uses CAS (central authentication service) to maintain cross-site information.
By using OAuth 2.0, you can authorize one website to access your information on another website. |
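A minimal JWT sketch, assuming the PyJWT package and an illustrative shared secret (real deployments often use asymmetric keys instead):

import time
import jwt   # assumes the PyJWT package

SECRET = "change-me"   # illustrative shared secret

# The server issues a signed token after verifying the user's credentials.
token = jwt.encode(
    {"sub": "user-123", "exp": int(time.time()) + 3600},  # expires in 1 hour
    SECRET,
    algorithm="HS256",
)

# Any server holding the secret can verify the signature and expiry,
# so no session state needs to be stored on the server side.
claims = jwt.decode(token, SECRET, algorithms=["HS256"])
print(claims["sub"])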
B-Tree vs. LSM-Tree
B-Tree
B-Tree is the most widely used indexing data structure in almost all relational databases.
The basic unit of information storage in a B-Tree is usually called a "page". A lookup walks down the tree, narrowing the key range at each level, until the actual value is found.
LSM-Tree
LSM-Tree (Log-Structured Merge Tree) is widely used by many NoSQL databases, such as Cassandra, LevelDB, and RocksDB.
LSM-Trees maintain key-value pairs that are persisted to disk in Sorted String Table (SSTable) files, in which the keys are sorted.
Level 0 segments are periodically merged into Level 1 segments. This process is called compaction.
The biggest difference is probably this:
B-Tree enables faster reads
LSM-Tree enables faster writes |
Designing a location-based service
How to make your website rank higher?
A search engine works in 3 stages:
The crawler reads the page content (HTML code) and follows hyperlinks to read more web pages.
The preprocessor also works in 3 steps:
It removes HTML tags and "stop" words, which are words like "a" or "an" or "the". It also removes other noise that is not relevant to the web page's content, for example, the disclaimer.
Then the keywords form structured indices, called forward indices and inverted indices.
The preprocessor calculates the hyperlink relationships, for example, how many hyperlinks are on the web page and how many hyperlinks point to it.
When a user types in a search term, the search engine uses the indices and ranking algorithms to rank the web pages and presents the search results to the user.
How do we make our website rank higher in search results? The diagram below shows some ways to do this.
Optimize website structure:
We need to make it easier for the crawler to crawl our website. Remove anything the crawler cannot read, including Flash, frames, and dynamic URLs. Make the website hierarchy less deep, so the web pages are less distant from the main home page.
The URLs must be short and descriptive. Try to include keywords in the URLs as well. It also helps to use HTTPS. But don't use underscores in the URL because that will break tokenization.
Choose the keywords to optimize for:
Keywords must be relevant to what the website is selling, and they must have business value. For example, a keyword is considered valuable if it's a popular search term but has fewer competing search results.
Optimize the web page
The crawler crawls the HTML contents. Therefore the title and description should be optimized to include keywords and be concise. The body of the web page should include relevant keywords.
Another aspect is the user experience. In May 2020, Google published Core Web Vitals, officially listing user experience as an important factor of page ranking algorithms.
External link
If our website is referenced by a highly-ranked website, it will increase our website's ranking. So carefully building external links is important. Publishing high-quality content on your website that is useful to other users is a good way to attract external links.
Over to you: What are your top SEO recommendations? |
How to design a secure web API access
Token based
Step 1 - the user enters their password into the client, and the client sends the password to the Authentication Server.
Step 2 - the Authentication Server authenticates the credentials and generates a token with an expiry time.
Steps 3 and 4 - now the client can send requests to access server resources with the token in the HTTP header. This access is valid until the token expires.
HMAC based
This mechanism generates a Message Authentication Code (signature) by using a hash function (SHA256 or MD5).
Steps 1 and 2 - the server generates two keys, one is Public APP ID (public key) and the other one is API Key (private key).
Step 3 - we now generate an HMAC signature on the client side (hmac A). This signature is generated with a set of attributes listed in the diagram.
Step 4 - the client sends requests to access server resources with hmac A in the HTTP header.
Step 5 - the server receives the request, which contains the request data and the authentication header. It extracts the necessary attributes from the request and uses the API key that's stored on the server side to generate a signature (hmac B).
Steps 6 and 7 - the server compares hmac A (generated on the client side) and hmac B (generated on the server side). If they match, the requested resource will be returned to the client.
Question - How does HMAC authentication ensure data integrity? Why do we include the "request timestamp" in HMAC signature generation? |
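A sketch of the client-side signature generation; the exact attribute set and names are illustrative, and both sides must build the identical string:

import hashlib
import hmac
import time

API_KEY = b"server-issued-private-key"      # illustrative; never sent over the wire
APP_ID = "public-app-id"                    # illustrative public identifier

def sign_request(method, path, body, timestamp):
    # Client and server concatenate the same attributes in the same order.
    message = f"{APP_ID}\n{method}\n{path}\n{body}\n{timestamp}".encode()
    return hmac.new(API_KEY, message, hashlib.sha256).hexdigest()

timestamp = str(int(time.time()))
hmac_a = sign_request("GET", "/v1/orders", "", timestamp)
# The client sends APP_ID, the timestamp, and hmac_a in headers; the server rebuilds
# hmac_b from the received attributes and compares. Any tampering with the attributes
# changes the signature, and the timestamp limits how long a captured request can be replayed.
print(hmac_a)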
Vertical partitioning vs horizontal partitioning
Proximity service
- Business Service
- Add/delete/update restaurant information
- Customers view restaurant details
- Location-based Service
- Given a radius and location, return a list of nearby restaurants
How are the restaurant locations stored in the database so that LBS can return nearby restaurants efficiently?
Store the latitude and longitude of restaurants in the database? The query would be very inefficient because you would need to calculate the distance between yourself and every restaurant.
One way to speed up the search is using the geohash algorithm.
First, divide the planet into four quadrants along the prime meridian and the equator:
- Latitude range [-90, 0] is represented by 0
- Latitude range [0, 90] is represented by 1
- Longitude range [-180, 0] is represented by 0
- Longitude range [0, 180] is represented by 1
Second, divide each grid into four smaller grids. Each grid can be represented by alternating between longitude bit and latitude bit.
So when you want to search for the nearby restaurants in the red-highlighted grid, you can write SQL like:
SELECT * FROM geohash_index WHERE geohash LIKE '01%'
Geohash has some limitations. There can be a lot of restaurants in one grid (downtown New York), but none in another grid (ocean). So there are other more complicated algorithms to optimize the process. Let me know if you are interested in the details. |
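A sketch of the recursive quadrant encoding described above, in raw bit-string form rather than the standard base32 geohash (the 8-bit precision is arbitrary):

def geohash_bits(lat, lon, precision=8):
    """Interleave longitude and latitude bits by repeatedly halving the ranges."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    bits = []
    for i in range(precision):
        if i % 2 == 0:                       # even positions: longitude bit
            mid = sum(lon_range) / 2
            bits.append("1" if lon >= mid else "0")
            lon_range[0 if lon >= mid else 1] = mid
        else:                                # odd positions: latitude bit
            mid = sum(lat_range) / 2
            bits.append("1" if lat >= mid else "0")
            lat_range[0 if lat >= mid else 1] = mid
    return "".join(bits)

# Nearby points share a long common prefix, so a prefix query (LIKE '01%')
# returns everything inside the corresponding grid cell.
print(geohash_bits(40.7128, -74.0060))   # a point in New York City
print(geohash_bits(40.7306, -73.9352))   # another NYC point - note the shared prefix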
Black Friday flash sale
Design principles:
1. Less is more - fewer elements on the web page, fewer data queries to the database, fewer web requests, fewer system dependencies
2. Short critical path - fewer hops among services or merge into one service
3. Async processing- use message queues to handle high TPS
4. Isolation - isolate static and dynamic contents, isolate processes and databases for rare items
5. Overselling is bad, so deciding when to decrement the inventory is important (see the sketch below)
6. User experience is important. We definitely don't want to inform users that they have successfully placed orders but later tell them no items are actually available
Happy shopping! If I missed anything, please leave a comment. |
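One common way to avoid overselling (point 5) is to make the stock check and decrement atomic, for example with a Redis Lua script; this is a sketch with illustrative key names, assuming the redis-py client:

import redis  # assumes the redis-py package

r = redis.Redis()
r.set("inventory:item-42", 100)   # seed the stock for the flash-sale item

# The script runs atomically inside Redis: the check and the decrement cannot
# interleave, so two buyers can never both claim the last unit.
DECREMENT_IF_AVAILABLE = """
local stock = tonumber(redis.call('GET', KEYS[1]))
if stock and stock > 0 then
    return redis.call('DECR', KEYS[1])
end
return -1
"""
buy = r.register_script(DECREMENT_IF_AVAILABLE)

remaining = buy(keys=["inventory:item-42"])
if remaining >= 0:
    print("order accepted, remaining stock:", remaining)
else:
    print("sold out")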
System Design Blueprint: The Ultimate Guide
Data Structures Used in Daily Life
How does Twitter Recommend "For You" Timeline?
Architecture Characteristics
How do we design effective and safe APIs?
Managing Operational Challenges in Caching
Cache stampede (avalanche): automatic TTL renewal, distributed locks, staggered expiration, consistent hashing, circuit breakers, request rate limiting and load shedding
|
Cache penetration: store a placeholder value in the cache to represent non-existent data; use a Bloom filter
|
Hot key problem: conduct real-time traffic analysis to promptly detect emerging hot keys; split a hot key into multiple smaller keys and distribute them; use real-time monitoring so the cache system can expand quickly; store hot keys in a local cache, decreasing traffic to the remote cache system.
|
Large key problem: 1. Data Sharding 2. Hashing or Digest 3. External Storage 4. Data Compression 5. Database or Document Store
|
DevOps vs. SRE vs. Platform Engineering
A Crash Course in Caching
What are the API architectural styles?
REST
Proposed in 2000, REST is the most-used style. It is often used between front-end clients and back-end services. It is based on 6 architectural constraints. The payload format can be JSON, XML, HTML, or plain text.
GraphQL
GraphQL was proposed in 2015 by Meta. It provides a schema and type system, suitable for complex systems where the relationships between entities are graph-like. For example, in the diagram below, GraphQL can retrieve user and order information in one call, while in REST this needs multiple calls.
GraphQL is not a replacement for REST. It can be built upon existing REST services.
WebSocket
WebSocket is a protocol that provides full-duplex communication over TCP. Clients establish WebSocket connections to receive real-time updates from back-end services. Unlike REST, which always "pulls" data, WebSocket enables data to be "pushed".
Webhook
Webhooks are usually used for third-party asynchronous API calls. In the diagram below, for example, we use Stripe or Paypal for payment channels and register a webhook for payment results. When the third-party payment service is done, it notifies the payment service whether the payment succeeded or failed. Webhook calls are usually part of the system's state machine.
gRPC
Released in 2016, gRPC is used for communications among microservices. gRPC library handles encoding/decoding and data transmission.
SOAP
SOAP stands for Simple Object Access Protocol. Its payload is XML only, suitable for communications between internal systems. |
a load balancer and an API gateway
design a chat application
API architectural styles
How to scale from 0 to millions of users
Load balancer
A load balancer evenly distributes incoming traffic among web servers that are defined in a load-balanced set.
Web servers
Web servers return HTML pages or JSON responses for rendering.
Databases: vertical scaling and horizontal scaling
Cache
A cache is a temporary storage area that stores the result of expensive responses or frequently accessed data in memory so that subsequent requests are served more quickly.
CDN
A CDN is a network of geographically dispersed servers used to deliver static content. CDN servers cache static content like images, videos, CSS, JavaScript files, etc.
Message queue
A message queue is a durable component, stored in memory, that supports asynchronous communication.
Logging, metrics, automation
When working with a small website that runs on a few servers, logging, metrics, and automation support are good practices but not a necessity. However, now that your site has grown to serve a large business, investing in those tools is essential. |
Live streaming explained
Step 1: The raw video data is captured by a microphone and camera. The data is sent to the server side.
Step 2: The video data is compressed and encoded. For example, the compressing algorithm separates the background and other video elements. After compression, the video is encoded to standards such as H.264. The size of the video data is much smaller after this step.
Step 3: The encoded data is divided into smaller segments, usually seconds in length, so it takes much less time to download or stream.
Step 4: The segmented data is sent to the streaming server. The streaming server needs to support different devices and network conditions. This is called "Adaptive Bitrate Streaming." This means we need to produce multiple files at different bitrates in steps 2 and 3.
Step 5: The live streaming data is pushed to edge servers supported by a CDN (Content Delivery Network). Millions of viewers can watch the video from a nearby edge server. The CDN significantly lowers data transmission latency.
Step 6: The viewersโ devices decode and decompress the video data and play the video in a video player.
Steps 7 and 8: If the video needs to be stored for replay, the encoded data is sent to a storage server, and viewers can request a replay from it later.
Standard protocols for live streaming include:
RTMP (Real-Time Messaging Protocol): This was originally developed by Macromedia to transmit data between a Flash player and a server. Now it is used for streaming video data over the internet. Note that video conferencing applications like Skype use RTC (Real-Time Communication) protocol for lower latency.
HLS (HTTP Live Streaming): It requires the H.264 or H.265 encoding. Apple devices accept only HLS format.
DASH (Dynamic Adaptive Streaming over HTTP): Apple devices do not support DASH.
Both HLS and DASH support adaptive bitrate streaming.
Over to you: What are some of the optimizations that can be done in this process? Which type of storage is suitable for video persistence in step 7? |
Algorithms you should know
database isolation levels
- Serializable: This is the highest isolation level. Concurrent transactions are guaranteed to be executed in sequence.
- Repeatable Read: Data read during the transaction stays the same as it was when the transaction started.
- Read Committed: Data modifications can only be read after the transaction is committed.
- Read Uncommitted: Data modifications can be read by other transactions before a transaction is committed.
The isolation is guaranteed by MVCC (Multi-Version Concurrency Control) and locks.
The diagram below takes Repeatable Read as an example to demonstrate how MVCC works:
There are two hidden columns for each row: transaction_id and roll_pointer. When transaction A starts, a new Read View with transaction_id=201 is created. Shortly afterward, transaction B starts, and a new Read View with transaction_id=202 is created.
Now transaction A modifies the balance to 200; a new row is added to the log, and the roll_pointer points to the old row. Before transaction A commits, transaction B reads the balance data. Transaction B finds that transaction_id 201 is not committed, so it reads the next committed record (transaction_id=200).
Even when transaction A commits, transaction B still reads data based on the Read View created when transaction B starts. So transaction B always reads the data with balance=100. |
Why is a solid-state drive (SSD) fast?
Step 1: โCommands come from the user through the host interfaceโ [2]. The interface can be Serial ATA (SATA) or PCI Express (PCIe).
Step 2: โThe processor in the SSD controller takes the commands and passes them to the flash controllerโ [2].
Step 3: โSSDs also have embedded RAM memory, generally for caching purposes and to store mapping informationโ [2].
Step 4: โThe packages of NAND flash memory are organized in gangs, over multiple channelsโ [2].
The second diagram illustrates how the logical and physical pages are mapped, and why this architecture is fast.
The SSD controller operates multiple flash chips in parallel, greatly improving the underlying bandwidth. When we need to write more than one page, the SSD controller can write them in parallel [3], whereas the HDD has a single head and can only read from one location at a time.
Every time a HOST Page is written, the SSD controller finds a Physical Page to write the data and this mapping is recorded. With this mapping, the next time HOST reads a HOST Page, the SSD knows where to read the data from FLASH [3]. |
At most once, at least once, exactly once
At-most once
As the name suggests, at-most once means a message will be delivered no more than once. Messages may be lost but are not redelivered. This is how at-most once delivery works at a high level.
Use cases: It is suitable for use cases like monitoring metrics, where a small amount of data loss is acceptable.
At-least once
With this data delivery semantic, itโs acceptable to deliver a message more than once, but no message should be lost.
Use cases: With at-least once, messages won't be lost but the same message might be delivered multiple times. While not ideal from a user perspective, at-least once delivery semantics are usually good enough for use cases where data duplication is not a big issue or deduplication is possible on the consumer side. For example, with a unique key in each message, a message can be rejected when writing duplicate data to the database.
Exactly once
Exactly once is the most difficult delivery semantic to implement. It is friendly to users, but it has a high cost for the system's performance and complexity.
Use cases: Financial-related use cases (payment, trading, accounting, etc.). Exactly once is especially important when duplication is not acceptable and the downstream service or third party doesn't support idempotency.
Question: what is the difference between message queues vs event streaming platforms such as Kafka, Apache Pulsar, etc? |
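A sketch of consumer-side deduplication that makes at-least once delivery effectively idempotent (the table and message fields are illustrative):

import sqlite3

db = sqlite3.connect(":memory:")
# The message's unique key is the primary key, so re-processing a duplicate fails.
db.execute("CREATE TABLE payments (message_id TEXT PRIMARY KEY, amount REAL)")

def handle(message):
    try:
        db.execute("INSERT INTO payments (message_id, amount) VALUES (?, ?)",
                   (message["id"], message["amount"]))
        db.commit()
        print("processed", message["id"])
    except sqlite3.IntegrityError:
        # The same message was delivered (and processed) before - safe to ignore.
        print("duplicate ignored", message["id"])

msg = {"id": "evt-1001", "amount": 42.0}
handle(msg)
handle(msg)   # redelivery of the same message is rejected by the unique key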