The diagram below shows the Git workflow.
Git is a distributed version control system. Every developer maintains a local copy of the main repository, and edits and commits are made to that local copy. Committing is very fast because the operation doesn’t interact with the remote repository. If the remote repository crashes, the files can be recovered from the local copies.
The diagram below shows why real-time gaming and low-latency trading applications should not use microservice architecture.
There are some common features of these applications, which make them choose monolithic architecture:
These applications are very latency-sensitive. For real-time gaming, the latency should be at the millisecond level; for low-latency trading, the latency should be at the microsecond level. We cannot separate the services into different processes because the added network latency is unacceptable.
Microservice architecture is usually stateless, with state persisted in the database. Real-time gaming and low-latency trading need to keep state in memory for quick updates. For example, when a character is injured in a game, we don’t want to see the update 3 seconds later; that kind of user experience can kill a game (see the sketch after this list).
Real-time gaming and low-latency trading need to talk to the server at high frequency, and the requests need to go to the same running instance, so WebSocket connections and sticky routing are needed.
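To make the in-memory state argument concrete, here is a minimal Python sketch (player names and the damage value are hypothetical): the update completes without a network hop, so the game loop sees it immediately, whereas persisting state behind a database call would add at least a network round trip.

```python
import time

# Hypothetical in-memory game state: player_id -> health.
# Keeping state in process memory lets an update complete in
# nanoseconds, versus milliseconds for a database round trip.
players = {"alice": 100, "bob": 100}

def apply_damage(player_id: str, damage: int) -> int:
    # Pure in-memory update: no network hop, no serialization,
    # so the new state is visible to the game loop immediately.
    players[player_id] -= damage
    return players[player_id]

start = time.perf_counter_ns()
remaining = apply_damage("bob", 25)
elapsed_ns = time.perf_counter_ns() - start
print(f"bob has {remaining} HP (update took ~{elapsed_ns} ns)")
```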
In short, microservice architecture is designed to solve problems in certain domains. We need to ask “why” when designing applications.
What is web assembly (WASM)? Why does it attract so much attention?
The diagram shows how we can run native C/C++/Rust code inside a web browser with WASM.
Traditionally, we can only work with JavaScript in the web browser, and the performance cannot compare with native code like C/C++ because JavaScript is interpreted.
However, with WASM, we can reuse existing native code libraries developed in C/C++/Rust, etc., and run them in the web browser. These web applications achieve near-native performance.
For example, we can run a video encoding/decoding library (written in C++) in the web browser.
This opens a lot of possibilities for cloud computing and edge computing. We can run serverless applications with fewer resources and instant startup time.
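Outside the browser, the same idea powers the serverless and edge use cases mentioned above. Below is a minimal sketch of running a WASM module from a host application, assuming the wasmtime Python package (the tiny `add` module is hand-written for illustration; in practice it would be compiled from C/C++/Rust):

```python
# A minimal sketch of running a WASM module outside the browser,
# assuming the `wasmtime` Python package (pip install wasmtime).
from wasmtime import Engine, Store, Module, Instance

# A tiny module in WAT (WebAssembly Text format) exporting `add`;
# real modules would be compiled from C/C++/Rust toolchains.
WAT = """
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
"""

engine = Engine()
store = Store(engine)
module = Module(engine, WAT)            # compile the module
instance = Instance(store, module, [])  # instantiate with no imports
add = instance.exports(store)["add"]
print(add(store, 2, 3))  # -> 5, running as compiled WASM
```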
A dispute happens when a cardholder disagrees with a merchant’s charge. A chargeback is a process of reversing the charge. Sometimes, the two terms are used interchangeably.
A dispute is expensive: for every dollar in disputed transactions, an additional $1.50 is spent on fees and expenses.
Steps 1-3: The cardholder, Bob, raises a dispute with the card issuer. The issuing bank reviews the details. If the dispute appears legitimate, the issuing bank submits a chargeback request to the card network.
Steps 4-6: The card network sends the dispute to the acquiring bank. After reviewing the details, the acquiring bank might ask the merchant to resolve the issue.
Steps 7-8: The merchant has two options:
The merchant can accept the chargeback if it appears legitimate.
The merchant can fight the chargeback by re-presenting the transaction to the issuer, along with documents that support it.
Steps 9-11: The acquiring bank reviews the evidence and re-presents the transaction to the card network, which forwards it to the issuer.
Steps 12-14: The issuer reviews the representment. There are two options:
The issuer charges the transaction back to the cardholder;
The issuer submits the dispute to the card network for arbitration.
Step 15: The card network rules based on the evidence and assigns the final liability to either the cardholder or the merchant.
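Viewed end to end, the dispute flow is a small state machine. Here is a minimal Python sketch; the state names are my own distillation of steps 1-15, not a card-network standard:

```python
from enum import Enum, auto

# Hypothetical dispute lifecycle states distilled from steps 1-15.
class DisputeState(Enum):
    RAISED = auto()         # steps 1-3: cardholder disputes the charge
    CHARGEBACK = auto()     # steps 4-6: chargeback reaches the merchant
    ACCEPTED = auto()       # step 7: merchant accepts the chargeback
    REPRESENTMENT = auto()  # steps 8-11: merchant re-presents evidence
    RESOLVED = auto()       # steps 12-13: issuer re-charges the cardholder
    ARBITRATION = auto()    # steps 14-15: card network assigns liability

# Legal transitions between states.
TRANSITIONS = {
    DisputeState.RAISED: {DisputeState.CHARGEBACK},
    DisputeState.CHARGEBACK: {DisputeState.ACCEPTED,
                              DisputeState.REPRESENTMENT},
    DisputeState.REPRESENTMENT: {DisputeState.RESOLVED,
                                 DisputeState.ARBITRATION},
}

def advance(current: DisputeState, nxt: DisputeState) -> DisputeState:
    # Reject transitions the flow above does not allow.
    if nxt not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt

state = advance(DisputeState.RAISED, DisputeState.CHARGEBACK)
print(state.name)  # -> CHARGEBACK
```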
The diagram below shows several common deployment strategies.
Big Bang Deployment
Big Bang Deployment is quite straightforward: we roll out the new version in one go, with service downtime. Preparation is essential for this strategy, and we roll back to the previous version if the deployment fails.
💡 No downtime ❌ 💡 Targeted users ❌
Rolling Deployment
Rolling Deployment applies a phased approach compared with big bang deployment: the whole fleet of instances is upgraded one by one over a period of time.
💡 No downtime ✅ 💡 Targeted users ❌
Blue-Green Deployment
In blue-green deployment, two environments are deployed in production simultaneously. The QA team performs various tests on the green environment. Once the green environment passes the tests, the load balancer switches users to it.
💡 No downtime ✅ 💡 Targeted users ❌
Canary Deployment
With canary deployment, only a small portion of the instances are upgraded with the new version. Once all the tests pass, a portion of users are routed to the canary instances.
💡 No downtime ✅ 💡 Targeted users ❌
Feature Toggle
With a feature toggle, a small portion of users with a specific flag go through the new feature’s code path, while other users go through the normal code path. This can be used in combination with the other strategies: either the new branch of code is rolled out in one go, or only a few instances are upgraded with the new code.
💡 No downtime ✅ 💡 Targeted users ✅
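Here is a minimal sketch of how a feature toggle can route a deterministic percentage of users to the new code path (the flag name and the 5% rollout figure are hypothetical):

```python
import hashlib

# Hypothetical rollout table: flag name -> percent of users enabled.
ROLLOUT_PERCENT = {"new_checkout": 5}

def is_enabled(flag: str, user_id: str) -> bool:
    # Hash the (flag, user) pair so each user gets a stable bucket
    # in [0, 100) and therefore the same decision on every request.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100
    return bucket < ROLLOUT_PERCENT.get(flag, 0)

def checkout(user_id: str) -> str:
    if is_enabled("new_checkout", user_id):
        return "new checkout flow"   # new feature branch
    return "legacy checkout flow"    # normal code path

print(checkout("user-42"))
```

Because the bucketing is deterministic, ramping the rollout from 5% to 100% only requires changing the configured percentage, with no redeploy.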
The diagram below shows a design for a simplified 1-to-1 chat application.
User Login Flow
Step 1: Alice logs in to the chat application and establishes a WebSocket connection with the server side.
Steps 2-4: The presence service receives Alice's notification, updates her presence, and notifies Alice's friends about her presence.
Messaging Flow
Steps 1-2: Alice sends a chat message to Bob. The chat message is routed to Chat Service A.
Steps 3-4: The chat message is sent to the sequencing service, which generates a unique message ID. The message is then persisted in the message store.
Step 5: The chat message is sent to the message sync queue to sync to Bob’s chat service.
Step 6: Before forwarding the message, the message sync service checks Bob’s presence: a) If Bob is online, the chat message is sent to Chat Service B. b) If Bob is offline, the message is sent to the push server and pushed to Bob’s device.
Steps 7-8: If Bob is online, the chat message is pushed to Bob via the WebSocket connection.
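The sequencing service in steps 3-4 must hand out IDs that are unique and time-ordered, so messages can be persisted and synced in order. Below is a minimal sketch of a Snowflake-style generator; the bit layout and epoch are illustrative assumptions, not taken from the design above:

```python
import threading
import time

# A Snowflake-style ID: 41 bits of timestamp, 10 bits of worker ID,
# and a 12-bit per-millisecond counter. Layout is illustrative.
class Sequencer:
    def __init__(self, worker_id: int, epoch_ms: int = 1288834974657):
        self.worker_id = worker_id & 0x3FF  # 10-bit worker ID
        self.epoch_ms = epoch_ms
        self.last_ms = -1
        self.counter = 0                    # 12-bit counter per ms
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.counter = (self.counter + 1) & 0xFFF
                if self.counter == 0:       # exhausted this ms; wait
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.counter = 0
            self.last_ms = now
            return (((now - self.epoch_ms) << 22)
                    | (self.worker_id << 12) | self.counter)

seq = Sequencer(worker_id=1)
print(seq.next_id())  # IDs increase monotonically across messages
```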
The answer will vary depending on your use case. Data can be indexed in memory or on disk. Similarly, data formats vary, such as numbers, strings, geographic coordinates, etc. The system might be write-heavy or read-heavy. All of these factors affect your choice of database index format.
The following are some of the most popular data structures used for indexing data:
Skiplist: a common in-memory index type. Used in Redis
Hash index: a very common implementation of the “Map” data structure (or “Collection”)
SSTable: immutable on-disk “Map” implementation
LSM tree: Skiplist + SSTable. High write throughput (see the toy sketch after this list)
B-tree: disk-based solution. Consistent read/write performance
Inverted index: used for document indexing. Used in Lucene
Suffix tree: for string pattern search
R-tree: multi-dimension search, such as finding the nearest neighbor
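To make the LSM tree entry concrete, here is a toy sketch in which a plain dict stands in for the skiplist memtable and in-memory sorted lists stand in for on-disk SSTables:

```python
# Toy LSM tree: writes go to an in-memory "memtable" (a real system
# uses a skiplist; a dict plus sort suffices here), which is flushed
# to an immutable sorted run (an "SSTable") once it grows too large.
# Reads check the memtable first, then runs from newest to oldest.
MEMTABLE_LIMIT = 4  # illustrative threshold

class TinyLSM:
    def __init__(self):
        self.memtable = {}
        self.sstables = []  # newest last; each is a sorted list of pairs

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= MEMTABLE_LIMIT:
            # Flush: freeze the memtable into a sorted, immutable run.
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):  # newest run wins
            for k, v in run:                 # real SSTables binary-search
                if k == key:
                    return v
        return None

db = TinyLSM()
for i in range(6):
    db.put(f"k{i}", i)
print(db.get("k2"), db.get("k5"))  # -> 2 5
```

Writes are append-only and sequential, which is where the high write throughput comes from; a real engine also compacts overlapping runs in the background.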
This is not an exhaustive list of all database index types. Over to you:
Which one have you used and for what purpose?
There is another one called “reverse index”. Do you know the difference between “reverse index” and “inverted index”?
When we merge changes from one Git branch to another, we can use ‘git merge’ or ‘git rebase’. The diagram below shows how the two commands work.
Git Merge
This creates a new commit G’ in the main branch. G’ ties together the histories of both the main and feature branches.
Git merge is non-destructive. Neither the main nor the feature branch is changed.
Git Rebase
Git rebase moves the feature branch history to the head of the main branch. It creates new commits E’, F’, and G’ for each commit in the feature branch.
The benefit of rebase is that it produces a linear commit history.
Rebase can be dangerous if “the golden rule of git rebase” is not followed.
The Golden Rule of Git Rebase: Never use it on public branches!
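Why does rebase create new commits E’, F’, and G’ instead of moving the old ones? Because a commit’s ID is a hash that covers its parent, so re-parenting the same changes necessarily produces new IDs. A toy model (not Git’s real object format):

```python
import hashlib

# Toy model: a commit's ID hashes its content AND its parent, so
# replaying the same changes onto a new parent yields new IDs.
def commit_id(parent: str, content: str) -> str:
    return hashlib.sha1(f"{parent}:{content}".encode()).hexdigest()[:7]

main_head = commit_id("base", "D")     # tip of main
feature_e = commit_id("base", "E")     # original feature commits
feature_f = commit_id(feature_e, "F")

# Rebase: replay the same content on top of main's head.
rebased_e = commit_id(main_head, "E")  # E'
rebased_f = commit_id(rebased_e, "F")  # F'

print(feature_e != rebased_e, feature_f != rebased_f)  # True True
```

This is also why rebasing a public branch is dangerous: everyone who pulled the original commits now holds IDs that no longer exist upstream.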
The growth of BNPL has been dramatic in recent years. The BNPL provider represents the primary interface between the merchants and the customers for both eCommerce and POS (Point of Sale).
The diagram below shows how the process works:
Step 0. Bob registers with AfterPay. An approved credit/debit card is linked to this account.
Step 1. Bob chooses the "Buy Now, Pay Later" payment option when purchasing a $100 product.
Steps 2-3. The BNPL provider checks Bob's credit score and approves the transaction.
Steps 4-5. The BNPL provider grants Bob a $100 consumer loan, which is usually financed by a bank. Of the $100, $96 is paid to the merchant immediately (yes, the merchant receives less with BNPL than with credit cards!). Bob must now pay the BNPL provider according to the payment schedule.
Steps 6-8. Bob now pays the $25 down payment to the BNPL provider. Stripe processes the payment transaction and forwards it to the card network. Since the payment goes through the card network, an interchange fee must be paid to the card network as well.
Step 9. The product is released, and Bob receives it.
Steps 10-11. The BNPL provider receives installment payments from Bob every two weeks. Payment gateways process installments by deducting them from credit/debit cards.
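A quick worked example of the money flow, assuming a common "pay in 4" plan (25% down, then three biweekly installments); the figures above only state the $96 payout and the $25 down payment:

```python
# Worked example of the BNPL money flow. The 4% merchant fee is
# implied by the $96 payout; the "pay in 4" schedule is an assumed
# common plan consistent with the $25 down payment.
price = 100.00
merchant_payout = 96.00            # step 5: $4 merchant fee withheld
down_payment = price * 0.25        # step 6: $25 paid at checkout
installments = [price * 0.25] * 3  # steps 10-11: $25 every two weeks

print(f"BNPL provider advances ${merchant_payout:.2f} to the merchant")
print(f"Bob pays ${down_payment:.2f} now, then "
      + ", ".join(f"${x:.2f}" for x in installments)
      + " over the next six weeks")
# Bob repays the full price; the provider earns the $4 spread
# (plus any late fees), minus interchange and processing costs.
assert down_payment + sum(installments) == price
```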
Since OpenAI hasn't provided all the details, some parts of the diagram may be inaccurate.
We attempted to explain how it works in the diagram below. The process can be broken down into two parts.
Training. To train a ChatGPT model, there are two stages:
Pre-training: In this stage, we train a GPT model (decoder-only transformer) on a large chunk of internet data. The objective is to train a model that can predict future words given a sentence, in a way that is grammatically correct and semantically meaningful, similar to the internet data it was trained on. After the pre-training stage, the model can complete given sentences, but it is not capable of responding to questions.
Fine-tuning: This stage is a 3-step process that turns the pre-trained model into a question-answering ChatGPT model:
1. Collect training data (questions and answers), and fine-tune the pre-trained model on this data. The model takes a question as input and learns to generate an answer similar to the training data.
2. Collect more data (a question with several answers) and train a reward model to rank these answers from most relevant to least relevant.
3. Use reinforcement learning (PPO optimization) to fine-tune the model so its answers are more accurate.
Answer a prompt
Step 1: The user enters the full question, “Explain how a classification algorithm works”.
Step 2: The question is sent to a content moderation component. This component ensures that the question does not violate safety guidelines and filters inappropriate questions.
Steps 3-4: If the input passes content moderation, it is sent to the ChatGPT model. If it doesn’t, it goes straight to template response generation.
Steps 5-6: Once the model generates the response, it is sent to the content moderation component again, which ensures the generated response is safe, harmless, unbiased, etc.
Step 7: If the response passes content moderation, it is shown to the user. If it doesn’t, it goes to template response generation, and a template answer is shown to the user.
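Putting steps 1-7 together, the serving path is essentially two moderation gates around the model. A minimal sketch with hypothetical stand-ins for the moderation and model components:

```python
# A minimal sketch of the answer-a-prompt pipeline (steps 1-7).
# The moderation rule and model function are hypothetical stand-ins.
BLOCKLIST = {"how to build a weapon"}  # toy safety rule

def moderate(text: str) -> bool:
    # Steps 2 and 6: return True if the text is considered safe.
    return text.lower() not in BLOCKLIST

def chatgpt_model(prompt: str) -> str:
    # Steps 3-5: stand-in for the fine-tuned model.
    return f"Here is an explanation of: {prompt}"

def template_response() -> str:
    return "Sorry, I can't help with that request."

def answer(prompt: str) -> str:
    if not moderate(prompt):      # input moderation (steps 2-4)
        return template_response()
    response = chatgpt_model(prompt)
    if not moderate(response):    # output moderation (steps 5-7)
        return template_response()
    return response

print(answer("Explain how a classification algorithm works"))
```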
Let’s look at this question in a longer time range to see what the cloud really brings us.
When a company or a business line initially starts, product-market fit (PMF) is key. The cloud enables quick setup to run the system with minimal necessary hardware. The cost is also transparent.
For example, if we run the databases on-premise, we need to take care of hardware setup, operating system installation, DBMS maintenance, etc. But if we use Amazon RDS (Relational Database Service), we just need to take care of application optimization. This saves us the trouble of hiring Linux admins and DB admins.
Later, if the business model doesn’t work, we can just stop using the services to save costs without thinking about how to deal with the hardware.
In research conducted by Cameron Fisher, the cloud starts at almost zero cost. Over time, costs accumulate across subscriptions and deployment consulting. Ironically, because it is so easy to allocate services to the cloud for scalability or reliability reasons, an organization tends to overuse the cloud after adopting it. It is essential to set up a monitoring framework for cost transparency.
👉 Over to you: Which notable companies use on-premise solutions and why?
Reference:
AWS guide: Choosing between Amazon EC2 and Amazon RDS
Cloud versus On-Premise Computing by Cameron Fisher, MIT
In 1998, Amazon's system architecture looked like this. The simplicity of the architecture is amazing.