Why is Kafka fast?
1. Batch Processing: Kafka groups messages together into batches, which reduces the network calls and disk I/O operations, thereby increasing throughput.
2. Sequential I/O: Kafka stores messages in a sequential, ordered manner on disk. This takes advantage of the underlying file system and disk's fast sequential read and write capabilities.
3. Zero Copy: Kafka uses the zero-copy technique to transfer data directly from the file system cache to network buffers, which reduces CPU usage.
4. Distributed System: Kafka is a distributed system that can be scaled horizontally by adding more nodes to the cluster. This allows it to handle high volumes of data and traffic.
5. Immutable Messages: Once written, messages in Kafka are immutable and are not updated, which simplifies the system and improves performance.
6. In-Memory Offsets: Kafka stores the offsets of messages in memory, which makes reading messages very fast. |
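As a rough illustration of point 1 (batching), here is a minimal producer sketch, assuming the kafka-python client, a broker at localhost:9092, and an "events" topic (all illustrative):

from kafka import KafkaProducer  # assumes the kafka-python package

# Messages are accumulated into per-partition batches (up to batch_size bytes)
# and flushed after linger_ms, reducing network round trips and disk I/O.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    batch_size=32 * 1024,   # 32 KB per-partition batch
    linger_ms=10,           # wait up to 10 ms to fill a batch
    acks="all",
)

for i in range(1000):
    producer.send("events", value=f"event-{i}".encode())

producer.flush()  # force any remaining batched messages out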
System Design Interview: A Step-By-Step Guide
Cache Systems Every Developer Should Know
What happens when you type a URL into your browser?
1. URL Parsing: The browser parses the URL to extract the protocol (e.g., HTTP, HTTPS), domain name (e.g., www.example.com), and any additional path or query parameters.
2. DNS Lookup: The browser checks its cache to find the IP address associated with the domain name. If not found, it sends a DNS (Domain Name System) lookup request to a DNS server to obtain the IP address.
3. TCP Connection: The browser establishes a TCP (Transmission Control Protocol) connection with the web server at the obtained IP address. This involves a three-way handshake to establish a reliable connection.
4. HTTP Request: The browser sends an HTTP (Hypertext Transfer Protocol) request to the web server. The request includes the HTTP method (e.g., GET, POST), the requested resource (e.g., /index.html), headers (e.g., user-agent, cookies), and other optional data.
5. Server Processing: The web server receives the HTTP request and processes it. This may involve accessing databases, executing server-side code, or retrieving files from the server's file system.
6. HTTP Response: The web server generates an HTTP response containing the requested resource, along with headers and a status code. The response may also include cookies, caching directives, or other metadata.
7. Data Transfer: The server sends the HTTP response back to the browser over the established TCP connection. The response is divided into packets and transmitted over the network.
8. Browser Rendering: The browser receives the response and begins rendering the web page. It parses the HTML, executes any embedded scripts, and fetches additional resources like CSS stylesheets, images, or JavaScript files referenced in the HTML.
9. Page Display: The browser combines the received resources to render and display the web page to the user. This includes rendering text, applying styles, and executing JavaScript code to add interactivity. |
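A rough sketch of steps 2-7 using only Python's standard library (example.com and /index.html are placeholder values):

import socket
import http.client

# Step 2: DNS lookup - resolve the domain name to an IP address.
ip_address = socket.gethostbyname("example.com")
print("Resolved IP:", ip_address)

# Steps 3-4: open a TCP connection and send an HTTP GET request.
conn = http.client.HTTPConnection("example.com", 80, timeout=10)
conn.request("GET", "/index.html", headers={"User-Agent": "demo-browser"})

# Steps 6-7: read the HTTP response (status code, headers, body).
response = conn.getresponse()
print(response.status, response.reason)
body = response.read()
conn.close()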
Types of different databases
HTTP/1.1, HTTP/2, and HTTP/3
5 Inter-Process Communications
Linux file system explained
How does GraphQL work at LinkedIn?
Things to consider when using cache
Suitable Scenarios:
- In-memory solution
- Read heavy system
- Data is not frequently updated
Caching Techniques:
- Cache aside
- Write-through
- Read-through
- Write-around
- Write-back
Cache Eviction Algorithms:
- Least Recently Used (LRU)
- Least Frequently Used (LFU)
- First-in First-out (FIFO)
- Random Replacement (RR)
Key Metrics:
- Cache Hit Ratio
- Latency
- Throughput
- Invalidation Rate
- Memory Usage
- CPU usage
- Network usage
Other Issues:
- Thundering herd on cold start
- Time-to-live (TTL) |
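A minimal cache-aside sketch, assuming the redis-py client and a hypothetical fetch_user_from_db function (key name and TTL are illustrative):

import json
import redis  # assumes the redis-py package

cache = redis.Redis(host="localhost", port=6379)
TTL_SECONDS = 300  # time-to-live to bound staleness

def get_user(user_id, fetch_user_from_db):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                                # cache hit
        return json.loads(cached)
    user = fetch_user_from_db(user_id)                    # cache miss: read from the database
    cache.set(key, json.dumps(user), ex=TTL_SECONDS)      # populate the cache for later reads
    return user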
which type of database to use
Online Ticketing Platform
The Chinese train ticket booking system has challenges similar to those of the Ticketmaster system:
Very high concurrent visits during peak hours.
The QPS for checking remaining tickets and orders is very high
A lot of bots
The solutions
Separate read and write requests. Because anxious users kept refreshing the web page to check if there were tickets available, the system could come under huge pressure.
To handle the calculations and queries in memory, the remaining-ticket components were moved entirely to GemFire. It is possible to fit the entire country's train tickets into several gigabytes of memory.
In addition, the order query component was moved to GemFire to reduce the load on the order database. Hadoop was used to store historical orders.
Leverage public cloud for elastic capacity.
Ban bots. It reduced the traffic by 95%.
Increase the bandwidth of the system.
Increase system availability by setting up more data centers in different cities.
Design multiple emergency plans. |
How will you design the Stack Overflow website?
What people think it should look like
The interviewer is probably expecting something on the left side.
Microservices are used to decompose the system into small components.
Each service has its own database. Use cache heavily.
Each service is sharded.
The services talk to each other asynchronously through message queues.
The services are implemented using Event Sourcing with CQRS.
Showing off knowledge in distributed systems such as eventual consistency, CAP theorem, etc.
What it actually is
Stack Overflow serves all of its traffic with only 9 on-premises web servers, and it's a monolith! It runs on its own servers and does not run in the cloud. |
Secret Sauce Behind NoSQL: LSM Tree
Design patterns cheat sheet
Design patterns cheat sheet2
Scan to pay
Here are the steps for generating the QR code:
When you want to pay for your shopping, the cashier tallies up all the goods and calculates the total amount due, for example, $123.45. The checkout has an order ID of SN129803. The cashier clicks the "checkout" button.
The cashierโs computer sends the order ID and the amount to PSP.
The PSP saves this information to the database and generates a QR code URL.
PSPโs Payment Gateway service reads the QR code URL.
The payment gateway returns the QR code URL to the merchantโs computer.
The merchantโs computer sends the QR code URL (or image) to the checkout counter.
The checkout counter displays the QR code.
These 7 steps are completed in less than a second. Now it's the consumer's turn to pay from their digital wallet by scanning the QR code:
The consumer opens their digital wallet app to scan the QR code.
After confirming the amount is correct, the consumer clicks the "pay" button.
The digital wallet app notifies the PSP that the consumer has paid the given QR code.
The PSP payment gateway marks this QR code as paid and returns a success message to the consumer's digital wallet app.
The PSP payment gateway notifies the merchant that the consumer has paid the given QR code.
Over to you: I have detailed how to pay using a dynamic QR code. It is dynamic because the QR code is dynamically generated each time. But sometimes, you could pay by scanning a printed QR code in a merchant's shop, which is called a static QR code. Do you know how a static QR code works? |
Linux file permission illustrated.
How to store passwords safely
How to avoid crawling duplicate URLs at Google scale
Option 1: Use a Set data structure to check if a URL already exists or not. Set is fast, but it is not space-efficient.
Option 2: Store URLs in a database and check if a new URL is in the database. This can work but the load to the database will be very high.
Option 3: Bloom filter. This option is preferred. The Bloom filter was proposed by Burton Howard Bloom in 1970. It is a probabilistic data structure that is used to test whether an element is a member of a set.
- false: the element is definitely not in the set.
- true: the element is probably in the set.
False-positive matches are possible, but false negatives are not.
The diagram below illustrates how the Bloom filter works. The basic data structure for the Bloom filter is a bit vector. Each bit represents a hashed value. |
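A toy Bloom filter to make the idea concrete (the bit-vector size, hash count, and use of SHA-256 are arbitrary choices for illustration):

import hashlib

class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size              # the bit vector

    def _positions(self, item):
        # Derive several bit positions from differently-seeded hashes of the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False => definitely not present; True => probably present.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("https://example.com/page1")
print(bf.might_contain("https://example.com/page1"))  # True
print(bf.might_contain("https://example.com/other"))  # False (with high probability)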
How to scale a website to support millions of users
Step 1 - With the growth of the user base, one single application server cannot handle the traffic anymore. We put the application server and the database server into two separate servers.
Step 2 - The business continues to grow, and a single application server is no longer enough. So we deploy a cluster of application servers.
Step 3 - Now the incoming requests have to be routed to multiple application servers. How can we ensure each application server gets an even load? The load balancer handles this nicely.
Step 4 - With the business continuing to grow, the database might become the bottleneck. To mitigate this, we separate reads and writes in a way that frequent read queries go to read replicas. With this setup, the throughput for the database writes can be greatly increased.
Step 5 - Suppose the business continues to grow. One single database cannot handle the load on both the inventory table and user table. We have a few options:
1. Vertical scaling. Adding more power (CPU, RAM, etc.) to the database server. It has a hard limit.
2. Horizontal scaling by adding more database servers.
3. Adding a caching layer to offload read requests.
Step 6 - Now we can modularize the functions into different services. The architecture becomes service-oriented / microservice. |
S3-like storage system
In S3, an object resides in a bucket. The path looks like this: /bucket-to-share/script.txt. The bucket only has metadata. The object has metadata and the actual data.
The diagram below (Figure 2) illustrates how file uploading works. In this example, we first create a bucket named "bucket-to-share" and then upload a file named "script.txt" to the bucket.
1. The client sends an HTTP PUT request to create a bucket named "bucket-to-share." The request is forwarded to the API service.
2. The API service calls Identity and Access Management (IAM) to ensure the user is authorized and has WRITE permission.
3. The API service calls the metadata store to create an entry with the bucket info in the metadata database. Once the entry is created, a success message is returned to the client.
4. After the bucket is created, the client sends an HTTP PUT request to create an object named "script.txt".
5. The API service verifies the user's identity and ensures the user has WRITE permission on the bucket.
6. Once validation succeeds, the API service sends the object data in the HTTP PUT payload to the data store. The data store persists the payload as an object and returns the UUID of the object.
7. The API service calls the metadata store to create a new entry in the metadata database. It contains important metadata such as the object_id (UUID), bucket_id (which bucket the object belongs to), object_name, etc. |
Match buy and sell stock orders
Stocks go up and down. Do you know what data structure is used to efficiently match buy and sell orders?
Stock exchanges use order books. An order book is an electronic list of buy and sell orders, organized by price levels. It has a buy book and a sell book, where each side of the book contains a bunch of price levels, and each price level contains a list of orders (first in first out).
The diagram below is an example of price levels and the queued quantity at each price level.
So what happens when you place a market order to buy 2700 shares in the diagram?
- The buy order is matched with all the sell orders at price 100.10, and the first order at price 100.11 (illustrated in light red).
- Now, because the big buy order "eats up" the first price level on the sell book, the best ask price goes up from 100.10 to 100.11.
- So when the market is bullish, people tend to buy stocks aggressively, and the price goes up and up.
An efficient data structure for an order book must satisfy:
- Constant lookup time. Operations include: get volume at a price level or between price levels, query best bid/ask.
- Fast add/cancel/execute/update operations, preferably O(1) time complexity. Operations include: place a new order, cancel an order, and match an order. |
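A simplified sketch of the sell side of a book, with a FIFO queue per price level and a market buy matching against the best ask first (prices, quantities, and order IDs are made up; real matching engines use more specialized structures to hit the O(1) targets above):

from collections import deque

# Sell book: price level -> FIFO queue of (order_id, quantity).
sell_book = {
    100.10: deque([("s1", 1000), ("s2", 500)]),
    100.11: deque([("s3", 2000)]),
}

def market_buy(quantity):
    """Match a market buy order against the lowest ask levels first."""
    fills = []
    for price in sorted(sell_book):              # best (lowest) ask first
        level = sell_book[price]
        while level and quantity > 0:
            order_id, available = level[0]
            traded = min(quantity, available)
            fills.append((order_id, price, traded))
            quantity -= traded
            if traded == available:
                level.popleft()                  # resting order fully filled
            else:
                level[0] = (order_id, available - traded)
        if not level:
            del sell_book[price]                 # price level exhausted; best ask moves up
        if quantity == 0:
            break
    return fills

print(market_buy(2700))  # fills 1500 @ 100.10, then 1200 @ 100.11; best ask becomes 100.11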
Why is single-threaded Redis so fast?
1. It is an in-memory database
2. I/O multiplexing: a single thread waits on many sockets
3. Efficient data structures |
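Point 2 is the classic single-threaded event loop. A minimal sketch with Python's selectors module, here as a toy echo server on an arbitrary port, showing one thread waiting on many sockets:

import selectors
import socket

sel = selectors.DefaultSelector()              # epoll/kqueue under the hood
listener = socket.socket()
listener.bind(("localhost", 6380))
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ)

while True:
    # One thread blocks here, waiting on ALL registered sockets at once.
    for key, _ in sel.select():
        sock = key.fileobj
        if sock is listener:
            conn, _ = listener.accept()        # new client connection
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ)
        else:
            data = sock.recv(1024)
            if data:
                sock.sendall(data)             # echo the request back
            else:
                sel.unregister(sock)
                sock.close()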
Top 5 Most Used Architecture Patterns
Algorithms You Should Know
Top 7 Ways to 10x Your API Performance
Git MERGE vs REBASE: Everything You Need to Know
SSL, TLS, HTTPS Explained
1. Client Hello: The client initiates a secure connection by sending a "Client Hello" message to the server. This message includes information about the client's supported encryption algorithms and other parameters.
2. Server Hello: The server responds with a "Server Hello" message, selecting the strongest encryption algorithm and other parameters that both the client and server support.
3. Certificate Exchange: The server sends its digital certificate to the client. The certificate contains the server's public key, which is used for encryption and authentication.
4. Certificate Validation: The client verifies the authenticity of the server's certificate. It checks if the certificate is issued by a trusted Certificate Authority (CA) and if it has not expired or been revoked.
5. Key Exchange: The client generates a random symmetric encryption key and encrypts it using the server's public key. This encrypted key is sent to the server.
6. Session Key: The server decrypts the encrypted key using its private key and obtains the symmetric encryption key. Both the client and server now have the same session key for secure communication.
7. Secure Communication: The client and server use the session key to encrypt and decrypt data exchanged between them. This ensures that the data transmitted over the network is protected from eavesdropping and tampering.
8. HTTPS Connection: From this point onwards, the client and server communicate using the secure HTTPS protocol. All HTTP requests and responses are encrypted and decrypted using the session key. |
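From the application's point of view, most of these steps happen inside the TLS library. A small Python sketch that performs the handshake (steps 1-6) and then sends an encrypted request (steps 7-8); example.com is a placeholder host:

import socket
import ssl

context = ssl.create_default_context()          # trusted CAs, certificate validation

with socket.create_connection(("example.com", 443)) as raw_sock:
    # wrap_socket runs the TLS handshake: hello messages, certificate
    # validation, and key exchange all happen inside this call.
    with context.wrap_socket(raw_sock, server_hostname="example.com") as tls_sock:
        print("Negotiated protocol:", tls_sock.version())   # e.g. TLSv1.3
        print("Cipher suite:", tls_sock.cipher())
        tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
        print(tls_sock.recv(200))               # encrypted on the wire, decrypted here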
How to Choose a Replication Strategy
1. Leader-follower model
2. Multi-leader
Multi-leader replication provides high availability but requires careful design around consensus, conflict detection, and resolution mechanisms.
The primary advantage of this model is increased write availability.
The main disadvantage is managing conflicts:
2.1 Last Write Wins
2.2 Conflict-free Replicated Data Types (CRDTs)
CRDTs allow for seamless reconciliation of conflicting changes by merging them.
2.3 Operational Transformation
Operational transformation is often used in real-time collaborative applications. It takes the operation itself into account, not just the state of the data.
2.4 Application-specific Resolution
3. Leaderless replication
Quorum Writes and Reads
In this system, we use three important values.
'n' is the total number of nodes in our system.
'w', the write quorum, is the minimum number of nodes that need to agree for a write to be considered successful.
'r', the read quorum, is the minimum number of nodes that need to agree for a read to be valid.
For strong consistency, a general guideline is to have w + r > n. It ensures that any read overlaps with any write and returns the most recent value. |
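A tiny helper capturing the quorum rule (purely illustrative):

def is_strongly_consistent(n, w, r):
    """w + r > n guarantees every read quorum overlaps every write quorum."""
    return w + r > n

# n=3 with w=2, r=2: any 2 readers must include at least 1 node holding the latest write.
print(is_strongly_consistent(n=3, w=2, r=2))  # True
print(is_strongly_consistent(n=3, w=1, r=1))  # False - a read may miss the latest write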
HTTPS, SSL Handshake, and Data Encryption
CI/CD Pipeline Explained in Simple Terms
Common Rate Limiting Algorithms
Fixed Window Counter
Sliding Window Log
Sliding Window Counter
Token Bucket
Leaky Bucket |
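As an example of one of these algorithms, a minimal token bucket sketch (capacity and refill rate are arbitrary):

import time

class TokenBucket:
    def __init__(self, capacity, refill_rate_per_sec):
        self.capacity = capacity
        self.refill_rate = refill_rate_per_sec
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens proportionally to the time elapsed, capped at the capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1        # spend one token for this request
            return True
        return False                # bucket empty: reject (or queue) the request

bucket = TokenBucket(capacity=10, refill_rate_per_sec=5)
print([bucket.allow() for _ in range(12)])  # the burst drains the bucket, then requests are limited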
8 Data Structures in Databases
2-factor authenticators
proxy
A forward proxy is a server that sits between a group of client machines and the internet.
A reverse proxy sits between the internet and the web servers. It intercepts the requests from clients and talks to the web server on behalf of the clients. |
iQIYI database selection trees
Five ways to generate distributed unique ID
Why is RESTful API so popular?
What is a File Descriptor?
A file descriptor represents an open file. It is a unique number assigned by the operating system to each file. It is an abstraction for working with files. We need to use file descriptors to read from or write to files in our program. Each process maintains its own file descriptor table.
The diagram below shows the layered architecture in Linux filesystem. Letโs take process 1234 as an example.
- In User Space
When we open a file called "fileA.txt" in Process 1234, we get file descriptor fd1, which is equal to 3. We can then pass the file descriptor to other functions to write data to the file.
- In Kernel Space
In the Linux kernel, there is a process table to maintain the data for the processes. Each process has an entry in the table. Each process maintains a file descriptor table, with file descriptors as its indices. Notice that file descriptors 0, 1, and 2 are reserved in each file descriptor table to represent stdin, stdout, and stderr.
The file pointer points to an entry in the open file table, which has information about open files across all processes. Multiple file descriptors can point to the same file table entry. For example, file descriptors 0, 1, and 2 point to the same open file table entry.
Since different open file table entries can represent the same file, it is a waste of resources to store the file's static information so many times. We need another abstraction layer called the "vnode table" to store the static data.
In each file table entry, there is a vnode pointer, which points to an entry in the vnode table. The static information includes the file type, function pointers, reference counts, inode, etc. The inode describes a physical object in the filesystem.
- In Filesystem
The inode array element stores the actual file information, including permission mode, owners, timestamps, etc. The inode also points to the data blocks stored in the filesystem.
Over to you: When we close a file in a program, do you know which entries are deleted in these data structures? |
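A small user-space demonstration of file descriptors (fileA.txt is the same illustrative file name as above):

import os

# 0, 1, and 2 are already taken by stdin, stdout, and stderr, so the first file
# opened by this process typically gets file descriptor 3.
fd = os.open("fileA.txt", os.O_CREAT | os.O_WRONLY)
print("file descriptor:", fd)        # usually 3

os.write(fd, b"hello\n")             # the fd indexes this process's fd table

# close() releases this process's fd table entry; the kernel's open file table
# entry is freed once no descriptor in any process refers to it anymore.
os.close(fd)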
notifications pushed to our phones or PCs
How do we design for high availability
Inner, Left, Right, and Full join
Process and Thread
A Process means a program is in execution. When a program is loaded into the memory and becomes active, the program becomes a process. The process requires some essential resources such as registers, program counter, and stack.
A Thread is the smallest unit of execution within a process.
The following process explains the relationship between program, process, and thread.
1. The program contains a set of instructions.
2. The program is loaded into memory. It becomes one or more running processes.
3. When a process starts, it is assigned memory and resources. A process can have one or more threads. For example, in the Microsoft Word app, one thread might be responsible for spell checking and another thread for inserting text into the doc.
Main differences between process and thread:
- Processes are usually independent, while threads exist as subsets of a process.
- Each process has its own memory space. Threads that belong to the same process share the same memory.
- A process is a heavyweight operation. It takes more time to create and terminate.
- Context switching is more expensive between processes.
- Inter-thread communication is faster than inter-process communication. |
SWIFT: the Society for Worldwide Interbank Financial Telecommunication
Step 1: Bank A sends a message with transfer details to Regional Processor A in New York. The destination is Bank B.
Step 2: Regional processor validates the format and sends it to Slice Processor A. The Regional Processor is responsible for input message validation and output message queuing. The Slice Processor is responsible for storing and routing messages safely.
Step 3: Slice Processor A stores the message.
Step 4: Slice Processor A informs Regional Processor A the message is stored.
Step 5: Regional Processor A sends ACK/NAK to Bank A. ACK means a message will be sent to Bank B. NAK means the message will NOT be sent to Bank B.
Step 6: Slice Processor A sends the message to Regional Processor B in London.
Step 7: Regional Processor B stores the message temporarily.
Step 8: Regional Processor B assigns a unique ID MON (Message Output Number) to the message and sends it to Slice Processor B
Step 9: Slice Processor B validates MON.
Step 10: Slice Processor B authorizes Regional Processor B to send the message to Bank B.
Step 11: Regional Processor B sends the message to Bank B.
Step 12: Bank B receives the message and stores it.
Step 13: Bank B sends UAK/UNK to Regional Processor B. UAK (user positive acknowledgment) means Bank B received the message without error; UNK (user negative acknowledgment) means Bank B detected a checksum failure.
Step 14: Regional Processor B creates a report based on Bank B's response, and sends it to Slice Processor B.
Step 15: Slice Processor B stores the report.
Step 16 - 17: Slice Processor B sends a copy of the report to Slice Processor A. Slice Processor A stores the report. |
Low latency stock exchange
How does a modern stock exchange achieve microsecond latency? The principle is:
Do less on the critical path
- Fewer tasks on the critical path
- Less time on each task
- Fewer network hops
- Less disk usage
For the stock exchange, the critical path is:
- start: an order comes into the order manager
- mandatory risk checks
- the order gets matched and the execution is sent back
- end: the execution comes out of the order manager
Other non-critical tasks should be removed from the critical path.
We put together a design as shown in the diagram:
- deploy all the components in a single giant server (no containers)
- use shared memory as an event bus to communicate among the components, no hard disk
- key components like Order Manager and Matching Engine are single-threaded on the critical path, and each pinned to a CPU so that there is no context switch and no locks
- the single-threaded application loop executes tasks one by one in sequence
- other components listen on the event bus and react accordingly |
Top 5 Most-Used Deployment Strategies
Secret To Optimizing SQL Queries
To write sargable queries:
- Avoid using functions or calculations on indexed columns in the WHERE clause
- Use direct comparisons when possible, instead of wrapping the column in a function
- If we need to use a function on a column, consider creating a computed column or a function-based index, if the database system supports it |
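For example (an illustrative query): a predicate like WHERE YEAR(order_date) = 2023 forces the database to apply the function to every row and usually prevents an index on order_date from being used, whereas the sargable equivalent WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01' lets the database seek the index directly.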
10 Key Data Structures We Use Every Day
What Are Microservices Really All About?
Bare Metal, Virtual Machines, and Containers
How do you design a high-concurrency system?
1. Automatic scaling (scale out and in), with monitoring and alerting
2. Adopt a microservice architecture and scale each service independently based on its needs
3. Optimize the caching strategy
4. Ensure load balancing
5. Plan for failover, including data center failures
1. Horizontal Scaling: distribute the load by adding more server instances so the system can handle more concurrent requests. This is typically implemented with a load balancer that spreads requests across multiple servers.
2. Caching: use caches to store frequently accessed data and reduce the number of hits on the backend database or other resources. Common caching technologies include in-memory caches (such as Redis) and distributed caches (such as Memcached).
3. Asynchronous Processing: turn time-consuming operations into asynchronous tasks handled through message queues or task queues. This reduces request waiting time and increases system throughput.
4. Database Optimization: use a suitable database engine and optimization techniques such as indexing, partitioning, and caching of query results to improve read/write performance and concurrency.
5. Stateless Design: design the system to be stateless where possible, storing state externally (for example, in a database or cache). This makes the system easier to scale horizontally and improves elasticity.
6. Rate Limiting and Circuit Breaking: limit the request rate, or cut off part of the traffic when the system load is too high, to protect the system from being overloaded.
7. Distributed Architecture: split the system into multiple independent services that communicate via messaging or API calls. This improves scalability and fault tolerance.
8. Monitoring and Automation: build monitoring to track system performance and health in real time, so potential problems are detected and resolved early. Automated deployment and scaling reduce manual intervention and improve reliability and maintainability.
9. Fault Tolerance and Recovery: design the system to tolerate failures of individual components or services and to recover quickly, using backups and redundancy to keep the system available.
10. Performance Testing and Optimization: run performance and load tests to find bottlenecks, then optimize and tune accordingly. |
There are 3 ways to use Redis as a message queue
Pub/Sub is convenient but has some delivery restrictions. The consumer subscribes to a channel and receives the data when a producer publishes data to the same channel. The restriction is that the data is delivered at most once. If a consumer was down and didn't receive the published data, that data is lost. Also, the data is not persisted on disk. If Redis goes down, all Pub/Sub data is lost. Pub/Sub is suitable for metrics monitoring, where some data loss is acceptable.
The List data structure in Redis can construct a FIFO (First-In-First-Out) queue. The consumer uses BLPOP to wait for messages in blocking mode, so a timeout should be applied. Consumers waiting on the same List form a consumer group where each message is consumed by only one consumer. As a Redis data structure, List can be persisted to disk.
Stream solves the restrictions of the above two methods. Consumers choose where to read messages from: "$" for new messages, "<id>" for a specific message id, or "0-0" for reading from the start. |
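A rough sketch of the List and Stream approaches, assuming the redis-py client (key names and fields are arbitrary):

import redis  # assumes the redis-py package

r = redis.Redis()

# List as a FIFO queue: produce with RPUSH, consume with blocking BLPOP.
r.rpush("jobs", "job-1")
item = r.blpop("jobs", timeout=5)          # blocks up to 5 s; returns None on timeout

# Stream: messages are persisted with IDs, and consumers choose where to start reading.
r.xadd("events", {"type": "signup", "user": "42"})
new_only = r.xread({"events": "$"}, block=2000)    # "$": only messages published after this call
from_start = r.xread({"events": "0-0"}, count=10)  # "0-0": read from the very beginning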
Benefits of Message Queues
Fan-out
Asynchronous Processing
Rate Limiting
Decoupling
Horizontal Scalability
Message Persistence
Batch Processing
Message Ordering |
A comparison between B-Tree and B+ Tree
9 best practices for developing microservices
data warehouse and a data lake
The 10 Algorithms That Dominate Our World
18 Most-used Linux Commands You Should Know
microservices tech stack
user identity management
WWW-Authenticate is the most basic method. The browser asks you for a username and password. Because the login life cycle cannot be controlled, it is seldom used today.
Session-cookie gives finer control over the login life cycle. The server maintains session storage, and the browser keeps the ID of the session. A cookie usually only works with browsers and is not mobile app friendly.
To address the compatibility issue, the token can be used. The client sends the token to the server, and the server validates the token. The downside is that the token needs to be encrypted and decrypted, which may be time-consuming.
JWT is a standard way of representing tokens. This information can be verified and trusted because it is digitally signed. Since JWT contains the signature, there is no need to save session information on the server side.
By using SSO (single sign-on), you can sign on only once and log in to multiple websites. It uses CAS (central authentication service) to maintain cross-site information.
By using OAuth 2.0, you can authorize one website to access your information on another website. |
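A minimal JWT sketch, assuming the PyJWT package and an illustrative shared secret (real deployments often use asymmetric keys instead):

import time
import jwt   # assumes the PyJWT package

SECRET = "change-me"   # illustrative shared secret

# The server issues a signed token after verifying the user's credentials.
token = jwt.encode(
    {"sub": "user-123", "exp": int(time.time()) + 3600},  # expires in 1 hour
    SECRET,
    algorithm="HS256",
)

# Any server holding the secret can verify the signature and expiry,
# so no session state needs to be stored on the server side.
claims = jwt.decode(token, SECRET, algorithms=["HS256"])
print(claims["sub"])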
B-Tree vs. LSM-Tree
B-Tree
B-Tree is the most widely used indexing data structure in almost all relational databases.
The basic unit of information storage in a B-Tree is usually called a "page". A lookup walks down the tree, narrowing the key range at each level, until the actual value is found.
LSM-Tree
LSM-Tree (Log-Structured Merge Tree) is widely used by many NoSQL databases, such as Cassandra, LevelDB, and RocksDB.
LSM-Trees maintain key-value pairs that are persisted to disk in Sorted String Table (SSTable) files, in which the keys are sorted.
Level 0 segments are periodically merged into Level 1 segments. This process is called compaction.
The biggest difference is probably this:
B-Tree enables faster reads
LSM-Tree enables faster writes |
Designing a location-based service
How to make your website rank higher?
A search engine works in 3 stages:
The crawler reads the page content (HTML code) and follows hyperlinks to read more web pages.
The preprocessor also works in 3 steps:
It removes HTML tags and "stop" words, which are words like "a" or "an" or "the". It also removes other noise that is not relevant to the web page's content, for example, the disclaimer.
Then the keywords form structured indices, called forward indices and inverted indices.
The preprocessor calculates the hyperlink relationships, for example, how many hyperlinks are on the web page and how many hyperlinks point to it.
When a user types in a search term, the search engine uses the indices and ranking algorithms to rank the web pages and presents the search results to the user.
How do we make our website rank higher in search results? The diagram below shows some ways to do this.
Optimize website structure:
We need to make it easier for the crawler to crawl our website. Remove anything the crawler cannot read, including Flash, frames, and dynamic URLs. Make the website hierarchy less deep, so the web pages are less distant from the main home page.
The URLs must be short and descriptive. Try to include keywords in the URLs as well. It also helps to use HTTPS. But don't use underscores in the URL because that will break tokenization.
Choose the keywords to optimize for:
Keywords must be relevant to what the website is selling, and they must have business value. For example, a keyword is considered valuable if it's a popular search term but has fewer competing search results.
Optimize the web page
The crawler crawls the HTML contents. Therefore the title and description should be optimized to include keywords and be concise. The body of the web page should include relevant keywords.
Another aspect is the user experience. In May 2020, Google published Core Web Vitals, officially listing user experience as an important factor of page ranking algorithms.
External link
If our website is referenced by a highly-ranked website, it will increase our website's ranking. So carefully building external links is important. Publishing high-quality content on your website that is useful to other users is a good way to attract external links.
Over to you: What are your top SEO recommendations? |
How to design a secure web API access
Token based
Step 1 - the user enters their password into the client, and the client sends the password to the Authentication Server.
Step 2 - the Authentication Server authenticates the credentials and generates a token with an expiry time.
Steps 3 and 4 - now the client can send requests to access server resources with the token in the HTTP header. This access is valid until the token expires.
HMAC based
This mechanism generates a Message Authentication Code (signature) by using a hash function (SHA256 or MD5).
Steps 1 and 2 - the server generates two keys, one is Public APP ID (public key) and the other one is API Key (private key).
Step 3 - we now generate an HMAC signature on the client side (hmac A). This signature is generated with a set of attributes listed in the diagram.
Step 4 - the client sends requests to access server resources with hmac A in the HTTP header.
Step 5 - the server receives the request, which contains the request data and the authentication header. It extracts the necessary attributes from the request and uses the API key that's stored on the server side to generate a signature (hmac B).
Steps 6 and 7 - the server compares hmac A (generated on the client side) and hmac B (generated on the server side). If they match, the requested resource will be returned to the client.
Question - How does HMAC authentication ensure data integrity? Why do we include the "request timestamp" in HMAC signature generation? |
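A sketch of the client-side signature generation; the exact attribute set and names are illustrative, and both sides must build the identical string:

import hashlib
import hmac
import time

API_KEY = b"server-issued-private-key"      # illustrative; never sent over the wire
APP_ID = "public-app-id"                    # illustrative public identifier

def sign_request(method, path, body, timestamp):
    # Client and server concatenate the same attributes in the same order.
    message = f"{APP_ID}\n{method}\n{path}\n{body}\n{timestamp}".encode()
    return hmac.new(API_KEY, message, hashlib.sha256).hexdigest()

timestamp = str(int(time.time()))
hmac_a = sign_request("GET", "/v1/orders", "", timestamp)
# The client sends APP_ID, the timestamp, and hmac_a in headers; the server rebuilds
# hmac_b from the received attributes and compares. Any tampering with the attributes
# changes the signature, and the timestamp limits how long a captured request can be replayed.
print(hmac_a)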
Vertical partitioning vs horizontal partitioning
Proximity service
- Business Service
- Add/delete/update restaurant information
- Customers view restaurant details
- Location-based Service
- Given a radius and location, return a list of nearby restaurants
How are the restaurant locations stored in the database so that LBS can return nearby restaurants efficiently?
Store the latitude and longitude of restaurants in the database? The query would be very inefficient because you would need to calculate the distance between yourself and every restaurant.
One way to speed up the search is using the geohash algorithm.
First, divide the planet into four quadrants along the prime meridian and the equator:
- Latitude range [-90, 0] is represented by 0
- Latitude range [0, 90] is represented by 1
- Longitude range [-180, 0] is represented by 0
- Longitude range [0, 180] is represented by 1
Second, divide each grid into four smaller grids. Each grid can be represented by alternating between longitude bit and latitude bit.
So when you want to search for the nearby restaurants in the red-highlighted grid, you can write SQL like:
SELECT * FROM geohash_index WHERE geohash LIKE '01%'
Geohash has some limitations. There can be a lot of restaurants in one grid (downtown New York), but none in another grid (ocean). So there are other more complicated algorithms to optimize the process. Let me know if you are interested in the details. |
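A sketch of the recursive quadrant encoding described above, in raw bit-string form rather than the standard base32 geohash (the 8-bit precision is arbitrary):

def geohash_bits(lat, lon, precision=8):
    """Interleave longitude and latitude bits by repeatedly halving the ranges."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    bits = []
    for i in range(precision):
        if i % 2 == 0:                       # even positions: longitude bit
            mid = sum(lon_range) / 2
            bits.append("1" if lon >= mid else "0")
            lon_range[0 if lon >= mid else 1] = mid
        else:                                # odd positions: latitude bit
            mid = sum(lat_range) / 2
            bits.append("1" if lat >= mid else "0")
            lat_range[0 if lat >= mid else 1] = mid
    return "".join(bits)

# Nearby points share a long common prefix, so a prefix query (LIKE '01%')
# returns everything inside the corresponding grid cell.
print(geohash_bits(40.7128, -74.0060))   # a point in New York City
print(geohash_bits(40.7306, -73.9352))   # another NYC point - note the shared prefix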
Black Friday flash sale
Design principles:
1. Less is more - fewer elements on the web page, fewer data queries to the database, fewer web requests, fewer system dependencies
2. Short critical path - fewer hops among services or merge into one service
3. Async processing- use message queues to handle high TPS
4. Isolation - isolate static and dynamic contents, isolate processes and databases for rare items
5. Overselling is bad, so deciding when to decrement the inventory is important (see the sketch below)
6. User experience is important. We definitely don't want to inform users that they have successfully placed orders but later tell them no items are actually available
Happy shopping! If I missed anything, please leave a comment. |
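One common way to avoid overselling (point 5) is to make the stock check and decrement atomic, for example with a Redis Lua script; this is a sketch with illustrative key names, assuming the redis-py client:

import redis  # assumes the redis-py package

r = redis.Redis()
r.set("inventory:item-42", 100)   # seed the stock for the flash-sale item

# The script runs atomically inside Redis: the check and the decrement cannot
# interleave, so two buyers can never both claim the last unit.
DECREMENT_IF_AVAILABLE = """
local stock = tonumber(redis.call('GET', KEYS[1]))
if stock and stock > 0 then
    return redis.call('DECR', KEYS[1])
end
return -1
"""
buy = r.register_script(DECREMENT_IF_AVAILABLE)

remaining = buy(keys=["inventory:item-42"])
if remaining >= 0:
    print("order accepted, remaining stock:", remaining)
else:
    print("sold out")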
System Design Blueprint: The Ultimate Guide
Data Structures Used in Daily Life
How does Twitter Recommend "For You" Timeline?
Architecture Characteristics
How do we design effective and safe APIs?
Managing Operational Challenges in Caching
Cache stampede (avalanche): automatic TTL renewal, distributed locks, staggered expiration, consistent hashing, circuit breakers, request rate limiting and load shedding
|
Cache penetration: store a placeholder value in the cache to represent non-existent data; use a Bloom filter
|
Hot key problem: conduct real-time traffic analysis to promptly detect emerging hot keys; split a hot key into multiple smaller keys and distribute them; use real-time monitoring so the cache system can expand quickly; store hot keys in a local cache, decreasing traffic to the remote cache system.
|
Large key problem: 1. Data Sharding 2. Hashing or Digest 3. External Storage 4. Data Compression 5. Database or Document Store
|
DevOps vs. SRE vs. Platform Engineering
A Crash Course in Caching
What are the API architectural styles?
REST
Proposed in 2000, REST is the most-used style. It is often used between front-end clients and back-end services. It is based on 6 architectural constraints. The payload format can be JSON, XML, HTML, or plain text.
GraphQL
GraphQL was proposed in 2015 by Meta. It provides a schema and type system, suitable for complex systems where the relationships between entities are graph-like. For example, in the diagram below, GraphQL can retrieve user and order information in one call, while in REST this needs multiple calls.
GraphQL is not a replacement for REST. It can be built upon existing REST services.
WebSocket
WebSocket is a protocol that provides full-duplex communication over TCP. Clients establish WebSocket connections to receive real-time updates from back-end services. Unlike REST, which always "pulls" data, WebSocket enables data to be "pushed".
Webhook
Webhooks are usually used for third-party asynchronous API calls. In the diagram below, for example, we use Stripe or Paypal for payment channels and register a webhook for payment results. When the third-party payment service is done, it notifies the payment service whether the payment succeeded or failed. Webhook calls are usually part of the system's state machine.
gRPC
Released in 2016, gRPC is used for communications among microservices. gRPC library handles encoding/decoding and data transmission.
SOAP
SOAP stands for Simple Object Access Protocol. Its payload is XML only, suitable for communications between internal systems. |
a load balancer and an API gateway
design a chat application
API architectural styles
How to scale from 0 to millions of users
Load balancer
A load balancer evenly distributes incoming traffic among web servers that are defined in a load-balanced set.
Web servers
Web servers return HTML pages or JSON responses for rendering.
Databases: vertical scaling and horizontal scaling
Cache
A cache is a temporary storage area that stores the result of expensive responses or frequently accessed data in memory so that subsequent requests are served more quickly.
CDN
A CDN is a network of geographically dispersed servers used to deliver static content. CDN servers cache static content like images, videos, CSS, JavaScript files, etc.
Message queue
A message queue is a durable component, stored in memory, that supports asynchronous communication.
Logging, metrics, automation
When working with a small website that runs on a few servers, logging, metrics, and automation support are good practices but not a necessity. However, now that your site has grown to serve a large business, investing in those tools is essential. |
Live streaming explained
Step 1: The raw video data is captured by a microphone and camera. The data is sent to the server side.
Step 2: The video data is compressed and encoded. For example, the compressing algorithm separates the background and other video elements. After compression, the video is encoded to standards such as H.264. The size of the video data is much smaller after this step.
Step 3: The encoded data is divided into smaller segments, usually seconds in length, so it takes much less time to download or stream.
Step 4: The segmented data is sent to the streaming server. The streaming server needs to support different devices and network conditions. This is called "Adaptive Bitrate Streaming." This means we need to produce multiple files at different bitrates in steps 2 and 3.
Step 5: The live streaming data is pushed to edge servers supported by a CDN (Content Delivery Network). Millions of viewers can watch the video from a nearby edge server. The CDN significantly lowers data transmission latency.
Step 6: The viewersโ devices decode and decompress the video data and play the video in a video player.
Steps 7 and 8: If the video needs to be stored for replay, the encoded data is sent to a storage server, and viewers can request a replay from it later.
Standard protocols for live streaming include:
RTMP (Real-Time Messaging Protocol): This was originally developed by Macromedia to transmit data between a Flash player and a server. Now it is used for streaming video data over the internet. Note that video conferencing applications like Skype use RTC (Real-Time Communication) protocol for lower latency.
HLS (HTTP Live Streaming): It requires the H.264 or H.265 encoding. Apple devices accept only HLS format.
DASH (Dynamic Adaptive Streaming over HTTP): Apple devices do not support DASH.
Both HLS and DASH support adaptive bitrate streaming.
Over to you: What are some of the optimizations that can be done in this process? Which type of storage is suitable for video persistence in step 7? |
Algorithms you should know
database isolation levels
- Serializable: This is the highest isolation level. Concurrent transactions are guaranteed to be executed in sequence.
- Repeatable Read: Data read during the transaction stays the same as it was when the transaction started.
- Read Committed: Data modifications can only be read after the transaction is committed.
- Read Uncommitted: Data modifications can be read by other transactions before a transaction is committed.
The isolation is guaranteed by MVCC (Multi-Version Concurrency Control) and locks.
The diagram below takes Repeatable Read as an example to demonstrate how MVCC works:
There are two hidden columns for each row: transaction_id and roll_pointer. When transaction A starts, a new Read View with transaction_id=201 is created. Shortly afterward, transaction B starts, and a new Read View with transaction_id=202 is created.
Now transaction A modifies the balance to 200; a new row is added to the log, and the roll_pointer points to the old row. Before transaction A commits, transaction B reads the balance data. Transaction B finds that transaction_id 201 is not committed, so it reads the next committed record (transaction_id=200).
Even when transaction A commits, transaction B still reads data based on the Read View created when transaction B starts. So transaction B always reads the data with balance=100. |
Why is a solid-state drive (SSD) fast?
Step 1: โCommands come from the user through the host interfaceโ [2]. The interface can be Serial ATA (SATA) or PCI Express (PCIe).
Step 2: โThe processor in the SSD controller takes the commands and passes them to the flash controllerโ [2].
Step 3: โSSDs also have embedded RAM memory, generally for caching purposes and to store mapping informationโ [2].
Step 4: โThe packages of NAND flash memory are organized in gangs, over multiple channelsโ [2].
The second diagram illustrates how the logical and physical pages are mapped, and why this architecture is fast.
The SSD controller operates multiple flash chips in parallel, greatly improving the underlying bandwidth. When we need to write more than one page, the SSD controller can write them in parallel [3], whereas the HDD has a single head and can only read from one location at a time.
Every time a HOST Page is written, the SSD controller finds a Physical Page to write the data and this mapping is recorded. With this mapping, the next time HOST reads a HOST Page, the SSD knows where to read the data from FLASH [3]. |
At most once, at least once, exactly once
At-most once
As the name suggests, at-most once means a message will be delivered no more than once. Messages may be lost but are not redelivered. This is how at-most once delivery works at a high level.
Use cases: It is suitable for use cases like monitoring metrics, where a small amount of data loss is acceptable.
At-least once
With this data delivery semantic, itโs acceptable to deliver a message more than once, but no message should be lost.
Use cases: With at-least once, messages won't be lost but the same message might be delivered multiple times. While not ideal from a user perspective, at-least once delivery semantics are usually good enough for use cases where data duplication is not a big issue or deduplication is possible on the consumer side. For example, with a unique key in each message, a message can be rejected when writing duplicate data to the database.
Exactly once
Exactly once is the most difficult delivery semantic to implement. It is friendly to users, but it has a high cost for the system's performance and complexity.
Use cases: Financial-related use cases (payment, trading, accounting, etc.). Exactly once is especially important when duplication is not acceptable and the downstream service or third party doesn't support idempotency.
Question: what is the difference between message queues vs event streaming platforms such as Kafka, Apache Pulsar, etc? |
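A sketch of consumer-side deduplication that makes at-least once delivery effectively idempotent (the table and message fields are illustrative):

import sqlite3

db = sqlite3.connect(":memory:")
# The message's unique key is the primary key, so re-processing a duplicate fails.
db.execute("CREATE TABLE payments (message_id TEXT PRIMARY KEY, amount REAL)")

def handle(message):
    try:
        db.execute("INSERT INTO payments (message_id, amount) VALUES (?, ?)",
                   (message["id"], message["amount"]))
        db.commit()
        print("processed", message["id"])
    except sqlite3.IntegrityError:
        # The same message was delivered (and processed) before - safe to ignore.
        print("duplicate ignored", message["id"])

msg = {"id": "evt-1001", "amount": 42.0}
handle(msg)
handle(msg)   # redelivery of the same message is rejected by the unique key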