The diagram below shows the differences between code-first development and API-first development. Why do we want to consider API-first design?
Microservices increase system complexity.
We have separate services to serve different functions of the system. While this kind of architecture facilitates decoupling and segregation of duties, we need to handle the various communications among services.
It is better to think through the system's complexity and carefully define the boundaries of the services before writing the code.
Separate functional teams need to speak the same language.
The dedicated functional teams are only responsible for their own components and services. It is recommended that the organization speak the same language via API design.
We can mock requests and responses to validate the API design before writing code.
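As a purely illustrative example, the minimal Python sketch below mocks a request and a response for a hypothetical "create order" endpoint and checks both against the agreed contract. The field names and types are assumptions for the sketch, not part of any real API.

```python
# Hypothetical API contract agreed on before implementation: field name -> expected type.
CREATE_ORDER_REQUEST = {"user_id": int, "sku": str, "quantity": int}
CREATE_ORDER_RESPONSE = {"order_id": str, "status": str}

def matches_schema(payload: dict, schema: dict) -> bool:
    """Check that a payload has exactly the agreed fields with the agreed types."""
    return payload.keys() == schema.keys() and all(
        isinstance(payload[key], expected_type) for key, expected_type in schema.items()
    )

# A mocked response lets the client team build and test against the contract
# before the server team writes any real code.
mock_request = {"user_id": 1, "sku": "book-123", "quantity": 2}
mock_response = {"order_id": "ord-001", "status": "CREATED"}

assert matches_schema(mock_request, CREATE_ORDER_REQUEST)
assert matches_schema(mock_response, CREATE_ORDER_RESPONSE)
print("mocked request/response conform to the agreed API design")
```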
Improve software quality and developer productivity
Since most of the uncertainties have been ironed out by the time the project starts, the overall development process is smoother, and the software quality is greatly improved.
Developers are happy about the process as well because they can focus on functional development instead of negotiating sudden changes.
The possibility of having surprises toward the end of the project lifecycle is reduced.
Because we have designed the API first, the tests can be designed while the code is being developed. In a way, API-first development also gives us TDD (Test-Driven Development).
The diagram below shows how the “ping” command works.
The “ping” command runs on ICMP (Internet Control Message Protocol), which is a network layer protocol.
There are 6 common types of messages in ICMP. For the ping command, we mainly use “echo request” and “echo reply”.
Host A sends an ICMP echo request (type = 8) with sequence number 1. The request is encapsulated with an IP header to specify the source and destination IP addresses.
When host B receives the data, it sends back an ICMP echo reply (type = 0) with sequence number 1 to host A.
When host A receives the echo reply, it correlates the request and reply using the sequence number, and uses the send and receive timestamps (T1 and T2 in the diagram) to calculate the round-trip time. That’s how we see the ping statistics.
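To make the message format concrete, here is a minimal Python sketch that builds an ICMP echo request (type = 8) with sequence number 1 and the standard Internet checksum. Actually sending it requires a raw socket and usually root privileges, so that part is only shown as a comment; the identifier and payload are arbitrary example values.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    # ICMP header: type (8 = echo request), code (0), checksum, identifier, sequence
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=0x1234, sequence=1)
print(packet.hex())
# Sending it needs a raw socket (typically root), for example:
#   sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
#   sock.sendto(packet, ("8.8.8.8", 0))
# The kernel then adds the IP header carrying the source and destination addresses.
```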
Since RPC has become a hot topic, let's briefly review its history.
The diagram below illustrates the API timeline and API style comparison.
Over time, different API styles have been released. Each of them has its own patterns for standardizing data exchange.
Session, cookie, JWT, token, SSO, and OAuth 2.0 - what are they?
These terms are all related to user identity management. When you log into a website, you declare who you are (identification). Your identity is verified (authentication), and you are granted the necessary permissions (authorization). Many solutions have been proposed in the past, and the list keeps growing.
From simple to complex, here is my understanding of user identity management:
WWW-Authenticate is the most basic method: the browser asks you for the username and password. Because it offers no control over the login life cycle, it is seldom used today.
The session-cookie approach gives finer control over the login life cycle. The server maintains session storage, and the browser keeps the ID of the session. A cookie usually only works with browsers and is not mobile-app friendly.
To address the compatibility issue, tokens can be used. The client sends the token to the server, and the server validates it. The downside is that the token needs to be encrypted and decrypted, which may be time-consuming.
JWT is a standard way of representing tokens. The information it carries can be verified and trusted because it is digitally signed. Since a JWT contains the signature, there is no need to save session information on the server side (see the sketch after this list).
By using SSO (single sign-on), you can sign in once and log in to multiple websites. It uses CAS (central authentication service) to maintain cross-site information.
By using OAuth 2.0, you can authorize one website to access your information on another website.
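To illustrate how a signature removes the need for server-side session storage, below is a minimal Python sketch of an HS256-signed JWT built with only the standard library. The secret and claims are invented for the example; a real system would use a vetted JWT library and also validate claims such as expiry.

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> bytes:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def sign_jwt(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode()),
        b64url(json.dumps(payload, separators=(",", ":")).encode()),
    ]
    signing_input = b".".join(segments)
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    segments.append(b64url(signature))
    return b".".join(segments).decode()

def verify_jwt(token: str, secret: bytes) -> dict:
    header_b64, payload_b64, sig_b64 = token.encode().split(b".")
    expected = b64url(hmac.new(secret, header_b64 + b"." + payload_b64, hashlib.sha256).digest())
    if not hmac.compare_digest(expected, sig_b64):
        raise ValueError("invalid signature")
    padded = payload_b64 + b"=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

secret = b"demo-secret"            # hypothetical shared secret
token = sign_jwt({"sub": "user-42", "role": "admin"}, secret)
print(token)
print(verify_jwt(token, secret))   # -> {'sub': 'user-42', 'role': 'admin'}
```

Because the claims travel with the token and the signature proves they were not tampered with, the server only needs the secret, not a per-user session record.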
Last week, Ticketmaster halted public ticket sales for Taylor Swift’s tour due to extraordinarily high demand on its ticketing system.
It’s an interesting problem, so we did some research on this topic. The diagram below shows the evolution of China's online train ticket booking system.
China's train ticket booking system faces challenges similar to Ticketmaster's:
Very high concurrent visits during peak hours.
The QPS for checking remaining tickets and orders is very high.
A lot of bot traffic.
The solutions:
Separate read and write requests. Because anxious users kept refreshing the web page to check whether tickets were available, the system came under huge pressure.
The remaining-ticket component was moved entirely to GemFire so that the calculations and queries could be handled in memory; it is possible to fit the entire country's train tickets into several gigabytes of memory. In addition, the order query component was moved to GemFire to reduce the load on the order database, and Hadoop was used to store historical orders. (A simplified sketch of the read/write separation follows below.)
Leverage public cloud for elastic capacity.
Ban bots. It reduced the traffic by 95%.
Increase the bandwidth of the system.
Increase system availability by setting up more data centers in different cities.
Design multiple emergency plans.
Note: the numbers are based on the back-of-the-envelope estimation (not official data).
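To make the read/write separation concrete, here is a highly simplified Python sketch in which a plain in-memory dictionary stands in for GemFire: reads of remaining tickets never touch the database, and writes decrement the in-memory count before the order is persisted elsewhere. The train IDs and counts are invented for illustration.

```python
import threading

# Hypothetical in-memory store of remaining seats per train, standing in for GemFire.
remaining = {"G101": 1200, "G102": 800}
lock = threading.Lock()

def check_remaining(train_id: str) -> int:
    """Read path: served entirely from memory, no database hit."""
    return remaining.get(train_id, 0)

def book_ticket(train_id: str) -> bool:
    """Write path: atomically decrement the in-memory count."""
    with lock:
        if remaining.get(train_id, 0) <= 0:
            return False
        remaining[train_id] -= 1
    # In the real system the confirmed order would then be written to the order
    # database (and later archived to Hadoop); omitted in this sketch.
    return True

print(check_remaining("G101"))  # read: 1200
print(book_ticket("G101"))      # write: True
print(check_remaining("G101"))  # read: 1199
```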
What is a decentralized social network service?
The diagram below shows a comparison between Twitter and Mastodon.
It is said that Trump's new social media platform Truth Social was built using Mastodon.
Mastodon runs self-hosted social network services. It is free and has no ads. Its MAU (Monthly Active Users) increased from 500K in October to 1 million in November, after Elon Musk’s takeover of Twitter.
Unlike Twitter, whose servers belong to the Twitter company, Mastodon’s servers do not belong to any company. Its network is composed of servers (instances) from different organizations.
When users register, they must choose a server to start with. Since the servers sync up with each other, users can still receive updates from other servers.
Because the network is run by volunteers, the company has only one employee: its founder, Eugen Rochko. It is funded through crowdfunding and is now supported by 3,500 people.
A remote procedure call (RPC) enables one machine to invoke some code on another machine as if it were a local function call from the user’s perspective.
gRPC is an open-source remote procedure call framework created by Google in 2016. What makes gRPC so popular?
First, gRPC has a thriving developer ecosystem. The core of this ecosystem is the use of Protocol Buffers as its data interchange format.
The second reason why gRPC is so popular is that it is high-performance out of the box.
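gRPC itself requires stubs generated from a .proto file, so as a stand-in the sketch below uses Python's built-in xmlrpc module to show the core RPC idea: the client calls proxy.add(2, 3) as if it were a local function, while the work actually happens in the server process. The host, port, and function are made up for the example.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# "Server" machine: exposes an ordinary function over the network.
def add(a: int, b: int) -> int:
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Client" machine: calls the remote function as if it were local.
proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
print(proxy.add(2, 3))  # -> 5, but the computation happened in the server process
```

gRPC follows the same pattern but exchanges Protocol Buffers over HTTP/2 through generated stubs, which is a large part of its performance advantage.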
The diagram compares monolithic and microservice architectures in an ideal world.
Suppose we have an eCommerce website that needs to handle the functions below:
User Management
Procurement Management
Order Management
Inventory Management
Payments
Logistics
In a monolithic architecture, all the components are deployed in a single instance. Service calls happen within the same process, so there are no RPCs. The data tables related to each component are usually deployed in the same database.
In a microservice architecture, each component becomes a self-contained service, maintained by a specialized team. The boundaries between services are clearly defined. The user interface talks to multiple services to get a workflow done. This is suitable for scaling out when the business grows substantially.
However, since there are many more instances to maintain, microservice architecture needs quite some investment in DevOps.
At one point, microservice architecture was the gold standard, as almost every large tech company moved from monoliths to microservices. But now, companies have started to rethink the pros and cons of microservices. Some of the most controversial aspects of microservices are the exclusive database per service and the 1,000+ RPCs that a single client request can trigger.
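The hypothetical Python sketch below contrasts the two styles for a simple "place order" flow: in the monolith, inventory and payment are ordinary in-process function calls; in the microservice version, the same steps become network calls to separately owned services. The service URLs are invented for illustration.

```python
import json
import urllib.request

# --- Monolith: components are plain modules/functions in one process ---
def reserve_inventory(order: dict) -> None:
    print("inventory reserved in-process")

def charge_payment(order: dict) -> None:
    print("payment charged in-process")

def place_order_monolith(order: dict) -> None:
    reserve_inventory(order)   # ordinary function call, no network hop
    charge_payment(order)

# --- Microservices: each component is a separate service behind an API ---
# Hypothetical URLs; each service would be owned and deployed by a different team.
INVENTORY_URL = "http://inventory-service/reserve"
PAYMENT_URL = "http://payment-service/charge"

def call_service(url: str, payload: dict) -> None:
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(request)  # network call; may fail, time out, or need retries

def place_order_microservices(order: dict) -> None:
    call_service(INVENTORY_URL, order)   # cross-service RPC
    call_service(PAYMENT_URL, order)     # cross-service RPC

place_order_monolith({"sku": "book-123", "qty": 1})
```

The extra network hops, failure modes, and deployment units are exactly where the DevOps investment mentioned above goes.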
k8s is a container orchestration system used for container deployment and management. Its design was greatly influenced by Google’s internal system Borg.
A k8s cluster consists of a set of worker machines, called nodes, that run containerized applications. Every cluster has at least one worker node. [1]
The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster. In production environments, the control plane usually runs across multiple computers and a cluster usually runs multiple nodes, providing fault tolerance and high availability. [1]
Control Plane Components
API Server: The API server talks to all the components in the k8s cluster. All the operations on pods are executed by talking to the API server.
Scheduler: The scheduler watches for newly created pods that have no assigned node and selects a node for them to run on.
Controller Manager: The controller manager runs the controllers, including the Node Controller, Job Controller, EndpointSlice Controller, and ServiceAccount Controller.
etcd: etcd is a key-value store used as Kubernetes' backing store for all cluster data.
Nodes
Pods: A pod is a group of containers and is the smallest unit that k8s administers. Each pod has a single IP address that is shared by every container within it.
Kubelet: An agent that runs on each node in the cluster. It ensures containers are running in a Pod. [1]
Kube Proxy: kube-proxy is a network proxy that runs on each node in your cluster. It routes traffic coming into a node from the service and forwards requests for work to the correct containers.
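As a small illustration of the API server's central role, the sketch below uses the official Kubernetes Python client to list pods. It assumes the kubernetes package is installed and a valid kubeconfig is available; every call shown is simply an HTTP request to the API server, which coordinates with the scheduler, controllers, etcd, and the kubelets.

```python
# Requires: pip install kubernetes, plus a reachable cluster and kubeconfig (e.g. ~/.kube/config).
from kubernetes import client, config

config.load_kube_config()      # authenticate against the cluster's API server
v1 = client.CoreV1Api()        # typed client for core resources (pods, services, ...)

# List all pods across namespaces; each entry reflects state stored in etcd
# and reported by the kubelets on the worker nodes.
pods = v1.list_pod_for_all_namespaces(watch=False)
for pod in pods.items:
    print(pod.metadata.namespace, pod.metadata.name, pod.status.pod_ip)
```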
👉 Over to you: Do you know why Kubernetes is called “k8s”?
Reference [1]: kubernetes.io/docs/concepts/overview/components/
There are hundreds or even thousands of databases available today, such as Oracle, MySQL, MariaDB, SQLite, PostgreSQL, Redis, ClickHouse, MongoDB, S3, Ceph, etc. How do you select the right one for your system? My short summary is as follows:
Relational databases. Almost anything can be solved by them.
In-memory stores. Their speed and limited data size make them ideal for fast operations.
Time-series databases. They store and manage time-stamped data.
Graph databases. They are suitable for complex relationships between unstructured objects.
Document stores. They are good for large immutable data.
Wide column stores. They are usually used for big data, analytics, reporting, etc., which need denormalized data.
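As a toy illustration of the first two categories, the Python sketch below models orders in a relational database (the built-in sqlite3) and caches a session in a plain dictionary standing in for an in-memory store such as Redis. The tables and values are made up for the example.

```python
import sqlite3

# Relational database: structured data, joins, transactions (sqlite3 ships with Python).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
db.execute("INSERT INTO users VALUES (1, 'Alice')")
db.execute("INSERT INTO orders VALUES (10, 1, 42.5)")
row = db.execute(
    "SELECT u.name, o.total FROM orders o JOIN users u ON u.id = o.user_id"
).fetchone()
print(row)  # ('Alice', 42.5)

# In-memory store: a plain dict stands in for Redis/Memcached here --
# very fast key lookups, no joins, and the data must fit in memory.
cache = {}
cache["session:abc123"] = {"user_id": 1, "expires_in": 3600}
print(cache["session:abc123"])
```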