Let’s investigate the third trend a bit more.
In recent years, there has been a significant shift in the way web applications are built. Many of today’s popular applications are built as what is called a Single Page Application (SPA).
A SPA provides a more seamless user experience by dynamically updating the current page instead of loading a new one each time the user interacts with the application. In a SPA, the initial HTML and its resources are loaded once, and subsequent interactions with the application are handled using JavaScript to manipulate the existing page content.
The traditional way of building web applications, like the one we discussed previously for our e-commerce company Llama, involved serving up new HTML pages from the server every time the user clicked on a link or submitted a form. This conventional model is referred to as a Multi-Page Application (MPA). Each page request typically involves a full page refresh, which could be slow and sometimes disruptive to the user experience.
In contrast, a SPA loads the application's initial HTML frame and then makes requests to the server for data as needed. This approach allows for more efficient use of server resources. The server is not constantly sending full HTML pages and can focus on serving data via a well-defined API instead.
Another benefit of the API-centric approach of a SPA is that the same API is often shared with mobile applications, making the backend easier to maintain.
SPAs are often built using JavaScript frameworks like React. These frameworks provide abstractions and tooling for building complex applications that are optimized for performance and maintainability. In contrast, building an MPA requires a more server-centric approach, which can be more challenging to scale and maintain as the application grows in complexity.
The rise in popularity of these client frameworks brought a wide array of production-grade frontend hosting platforms to the market. Some popular examples include Netlify and Vercel. Major cloud providers have similar offerings.
These hosting platforms handle the complexity of building and deploying modern frontend applications at scale. Developers check their code into a repo, and the hosting platforms take over from there. They automatically build the web application bundle and its associated resources and distribute them to the CDN.
Since these hosting platforms are built on cloud and serverless computing foundations, using best practices like serving data at the edge close to the user, they offer practically infinite scale, and there is no infrastructure to manage.
This is what the modern frontend landscape looks like. The frontend application is built with a modern framework like React. The client application is served by a production-grade hosting platform for scale, and it dynamically fetches data from the backend via a well-defined API.
As mentioned in the previous section, the role of a modern backend is to serve a set of well-defined APIs to support the frontend web and mobile applications.
What are the modern options for building a backend? The shift is similarly dramatic.
Let’s see what a small, resource-constrained startup should use to build its backend, and see how far such a backend could scale.
When time-to-market is critical and resources are limited, a startup should offload as much non-core work as possible. Serverless computing options are attractive, for the following reasons:
Serverless computing manages the operational aspects of the backend, such as scaling, redundancy, and failover, freeing the startup team from managing infrastructure.
Serverless computing follows a cost-effective pay-per-use pricing model. There is no up-front commitment.
Serverless computing allows developers to focus on writing code and testing the backend without worrying about managing servers, leading to shorter time to market.
The diagram below shows the common API architectural styles in one picture.
REST Proposed in 2000, REST is the most widely used style. It is often used between front-end clients and back-end services. A RESTful design conforms to six architectural constraints. The payload format can be JSON, XML, HTML, or plain text.
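To make the style concrete, here is a minimal in-process sketch of REST-style resource handling: each HTTP method plus path maps to a handler operating on a resource, and the payload is JSON. The `users` resource and its data are hypothetical, and a real service would run behind an HTTP server.

```python
import json

# Hypothetical in-memory resource store standing in for a database
users = {1: {"id": 1, "name": "Alice"}}

def get_user(user_id):
    # GET /users/<id> -> the JSON representation of one resource
    user = users.get(user_id)
    if user is None:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(user)

def create_user(payload):
    # POST /users -> create a new resource, return 201 with its representation
    new_id = max(users) + 1
    users[new_id] = {"id": new_id, **payload}
    return 201, json.dumps(users[new_id])

status, body = get_user(1)
created_status, created_body = create_user({"name": "Bob"})
```

The key REST idea illustrated here is the uniform interface: standard methods (GET, POST) applied to named resources, with state transferred as representations.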
GraphQL GraphQL was proposed in 2015 by Meta (then Facebook). It provides a schema and type system, suitable for complex systems where the relationships between entities are graph-like. For example, in the diagram below, GraphQL can retrieve user and order information in one call, while in REST this needs multiple calls.
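The "one call versus many" difference can be sketched with plain functions over in-memory data (the user and order records are made up). The REST style needs two endpoint calls; the GraphQL-style resolver returns the user with its nested orders in a single response shaped by the requested fields.

```python
# Hypothetical data standing in for two backend tables
users = {1: {"id": 1, "name": "Alice"}}
orders = {1: [{"id": 101, "total": 42.0}]}

# REST style: the client makes two separate calls and stitches the results
def rest_get_user(user_id):
    return users[user_id]

def rest_get_orders(user_id):
    return orders[user_id]

# GraphQL style: one query names the fields it wants, including nested
# relations, and the resolver assembles everything in a single round trip
def graphql_query(user_id, fields):
    user = users[user_id]
    result = {f: user[f] for f in fields if f in user}
    if "orders" in fields:
        result["orders"] = orders[user_id]
    return result

result = graphql_query(1, fields=["name", "orders"])
```

A real GraphQL server would parse a query document against a schema, but the resolver pattern is the same: the client declares the shape of the data, and the server fulfills it in one call.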
GraphQL is not a replacement for REST. It can be built upon existing REST services.
WebSocket WebSocket is a protocol that provides full-duplex communication over TCP. Clients establish WebSocket connections to receive real-time updates from the back-end services. Unlike REST, where the client always “pulls” data, WebSocket enables the server to “push” data.
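The push-versus-pull distinction can be illustrated without the protocol itself. In this in-process analogy, a queue stands in for the open connection: the "server" thread pushes updates as they happen, and the "client" simply waits for data to arrive instead of polling. The ticker data is hypothetical.

```python
import queue
import threading

# The queue plays the role of an open WebSocket connection
updates: queue.Queue = queue.Queue()

def server():
    # Server pushes events as they occur; no client request triggers them
    for price in (101.5, 102.0):
        updates.put({"ticker": "LLMA", "price": price})

t = threading.Thread(target=server)
t.start()
# The client blocks until the server pushes, rather than repeatedly pulling
first = updates.get(timeout=1)
second = updates.get(timeout=1)
t.join()
```

With a real WebSocket, the handshake upgrades an HTTP connection and then either side can send frames at any time, which is what makes server-initiated updates possible.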
Webhook Webhooks are usually used for asynchronous callbacks from third-party APIs. In the diagram below, for example, we use Stripe or PayPal as payment channels and register a webhook for payment results. When the third-party payment service finishes processing, it notifies our payment service whether the payment succeeded or failed. Webhook calls are usually part of the system’s state machine.
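Because a webhook endpoint is reachable by anyone, providers typically sign the payload so the receiver can verify authenticity before acting on it. Here is a sketch of such a receiver; the secret, payload fields, and signature scheme are hypothetical, not the actual Stripe or PayPal format.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret issued by the payment provider
SECRET = b"whsec_example"

def sign(payload: bytes) -> str:
    # HMAC-SHA256 over the raw body, hex-encoded
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def handle_webhook(payload: bytes, signature: str):
    # Reject the callback unless the signature matches the body
    if not hmac.compare_digest(sign(payload), signature):
        return 400, "invalid signature"
    event = json.loads(payload)
    # In a real system this would advance the payment's state machine
    return 200, f"payment {event['payment_id']} -> {event['status']}"

body = json.dumps({"payment_id": "p_1", "status": "succeeded"}).encode()
status, message = handle_webhook(body, sign(body))
```

Using `hmac.compare_digest` rather than `==` avoids leaking information through timing differences when comparing signatures.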
gRPC Released in 2016, gRPC is used for communication among microservices. The gRPC library handles encoding/decoding and data transmission.
SOAP SOAP stands for Simple Object Access Protocol. Its payload is XML only, and it is mainly used for communication between internal enterprise systems.
Over to you: What API architectural styles have you used?
In part 3 of this series, we explored the modern application stack for early-stage startups, including a frontend hosting platform, a serverless API backend, and a serverless relational database tier. This powerful combination has taken us far beyond what we used to be able to do with a single server.
As traffic continues to scale, this modern stack will eventually reach its limits. In this final part of the series, we examine where this modern stack might start to fall apart, and explore strategies to evolve it to handle more traffic.
At a certain threshold, handling the scale and complexity of the application requires an entirely new approach. We discussed microservice architecture briefly earlier in the series. We will build on that with an exploration of how hyper-growth startups can gradually migrate to a microservice architecture by leveraging cloud native strategies.
Well before we run into performance issues, we should have a robust operational monitoring and observability system in place. Otherwise, how would we know when the application starts to fall apart due to rising traffic load?
There are many observability solutions in the market. Building a robust system ourselves is no simple task. We strongly advise buying a solution. Many cloud providers have in-house offerings like AWS Cloud Operations that might be sufficient. Some SaaS offerings are powerful but expensive.
To get an early warning of any system-wide performance issue, we suggest monitoring these critical performance metrics. Keep in mind the list is not exhaustive.
For database:
Read and write request rate.
Query response time. Track p95 and median at the very least.
Database connections. Most databases have a connection limit. Track the number of connections to identify when to scale the database resources to handle increased traffic.
Lock contention. It measures the amount of time the database spends waiting for locks to be released, and it helps identify when to optimize the database schema or queries to reduce contention.
Slow queries. It tracks the number of slow queries. A rising slow-query rate is an early sign of trouble.
For the application tier, the serverless platform should provide these critical metrics:
Incoming request rate.
Response time. It measures the time it takes for an application to respond to a request. Track at least p95 and median.
Error rate. It tracks the percentage of requests that result in an error.
Network throughput. It measures the amount of network traffic the application is generating.
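The percentile metrics above can be computed directly from raw latency samples. This sketch uses the standard library; the millisecond values are made up for illustration, and a real monitoring system would compute these over sliding time windows.

```python
import statistics

# Hypothetical response-time samples in milliseconds; note the two outliers
latencies_ms = [12, 15, 14, 13, 250, 16, 15, 14, 13, 12,
                15, 14, 13, 12, 16, 14, 13, 15, 12, 300]

# The median describes the typical request
median = statistics.median(latencies_ms)

# quantiles(n=20) yields the 5th, 10th, ..., 95th percentiles;
# the last cut point is p95, which surfaces tail latency the median hides
p95 = statistics.quantiles(latencies_ms, n=20)[-1]
```

This is exactly why tracking both matters: the median here stays low while p95 exposes the slow tail caused by the outliers.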
Now that we are armed with the data, let’s see where an application is likely to fall apart first.
Every application is different, but for most applications, the following sections discuss the areas that are likely to show the first signs of cracks.
The first place that could break is the database tier. As traffic grows, the combined read and write traffic could start to overwhelm the serverless relational database. This limit could be quite high. As we discussed before, with a serverless database, the compute tier and the storage tier scale independently. The compute tier transparently scales vertically to a bigger instance as the load increases.
At some point, the metrics start to deteriorate as the load overwhelms the single serverless database. Fortunately, the playbook for scaling the database tier is well-known, and it stays largely the same as in the early days.
In part 2 of this series, we discussed three strategies for scaling the database tier in the traditional application stack. These strategies are still applicable to the modern serverless stack.
For a read-heavy application, we should consider migrating the read load to read replicas. With this method, we add a series of read replicas to the primary database to handle reads. We can have different replicas handle different kinds of read queries to spread the load.
The drawback of this approach is replication lag. Replication lag refers to the time difference between when a write operation is performed on the primary database and when it is reflected in the read replica. When replication lag occurs, it can lead to stale or inconsistent data being returned to clients when they query the read replica.
Whether this slight inconsistency is acceptable is determined on a case-by-case basis, and it is a tradeoff for supporting an ever-increasing scale. For the small number of operations that cannot tolerate any lags, those reads can always be directed at the primary database.
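Read/write splitting like this is often implemented as a small routing layer in front of the connection pool. The sketch below uses stand-in strings for connections and a naive query classifier (assumptions, not a production router): writes go to the primary, reads rotate across replicas, and lag-sensitive reads can be forced to the primary.

```python
import itertools

# Stand-ins for database connections; names are hypothetical
PRIMARY = "primary"
REPLICAS = ["replica-1", "replica-2"]
_replica_cycle = itertools.cycle(REPLICAS)

def route(query: str, lag_sensitive: bool = False) -> str:
    # Reads that cannot tolerate replication lag go to the primary
    if lag_sensitive:
        return PRIMARY
    # Naive classification: anything that is not a SELECT is a write
    if query.lstrip().upper().startswith("SELECT"):
        return next(_replica_cycle)  # spread reads round-robin
    return PRIMARY

targets = [route(q) for q in (
    "SELECT * FROM users",
    "UPDATE users SET name = 'x'",
    "SELECT 1",
)]
```

Real deployments usually delegate this to a proxy or driver feature rather than string inspection, but the routing decision is the same.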
Another approach to handle the ever-increasing read load is to add a caching layer to optimize the read operations.
Redis is a popular in-memory cache for this purpose. Redis reduces the read load for a database by caching frequently accessed data in memory. This allows for faster access to the data since it is retrieved from the cache instead of the slower database. With fewer read operations performed on the database, Redis reduces the load on the database cluster and enhances its overall scalability. Managed Redis solutions are available through many cloud providers, reducing the burden of operating a caching tier on a day-to-day basis.
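The usual way to use Redis here is the cache-aside pattern: check the cache first and fall back to the database only on a miss. In this sketch a plain dict stands in for Redis and a counter stands in for the database; with a real client, the dict operations would become `GET` and `SETEX` calls with a TTL.

```python
# A dict standing in for Redis; keys follow a "user:<id>" convention
cache = {}
db_reads = 0  # counts how often we actually hit the database

def query_db(user_id):
    global db_reads
    db_reads += 1
    # Hypothetical row returned by the relational database
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    if key in cache:            # cache hit: the database is not touched
        return cache[key]
    row = query_db(user_id)     # cache miss: read from the database...
    cache[key] = row            # ...and populate the cache for next time
    return row

first = get_user(42)
second = get_user(42)  # served entirely from the cache
```

The effect on the database is visible directly: two application reads cost only one database read. Choosing a TTL and an invalidation strategy for writes is the hard part in practice.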
Database sharding is a technique used to partition data across multiple database servers based on the values in one or more columns of a table. For example, a large user table can be divided based on user ID, resulting in multiple smaller tables stored on separate database servers. Each server handles a small subset of the rows that were previously managed by the single primary database, leading to improved query performance as each shard handles a smaller subset of data.
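A common way to pick the shard is to hash the sharding key. This sketch routes users to one of four hypothetical shards by hashing the user ID; a stable hash keeps each user pinned to the same shard on every request.

```python
import hashlib

# Hypothetical shard names; each would be a separate database server
SHARDS = ["users-shard-0", "users-shard-1", "users-shard-2", "users-shard-3"]

def shard_for(user_id: int) -> str:
    # A cryptographic hash gives a stable, evenly distributed mapping;
    # Python's built-in hash() is avoided because it varies between runs
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

assignments = {uid: shard_for(uid) for uid in range(1, 6)}
```

One design note: simple modulo hashing reshuffles most keys when the shard count changes, which is why production systems often use consistent hashing or a lookup-based shard map instead.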
However, database sharding has a significant drawback as it adds complexity to both the application and database layers. Managing and maintaining an increasing number of database shards becomes more complex. Application developers need to implement sharding logic in the code to ensure that the correct database shard is accessed for a given query or transaction. While sharding improves query performance, it can also make it harder to perform cross-shard queries or join data from multiple shards. This limitation can restrict the types of queries that can be performed.
Despite these drawbacks, database sharding is a useful technique for improving the scalability and performance of a large database, particularly when vertical scaling is no longer feasible. It is essential to plan and implement sharding carefully to ensure that its benefits outweigh its added complexity and limitations.
Rather than sharding the database, another feasible solution is to migrate a subset of the data to NoSQL. NoSQL databases offer horizontal scalability and high write rates, but at the expense of the relational model’s rich query capabilities and strict schema guarantees. If there is a subset of data that does not rely on the relational model, migrating that subset to a NoSQL database could be an effective approach to scaling the application.
This approach reduces the load on the relational database, allowing it to focus on complex relational queries while the NoSQL database handles the high-write workloads. It is important to consider carefully which subset of data to migrate, and the migration should be planned and executed carefully to avoid data inconsistency or loss.
SQL statements are executed by the database system in several steps, including:
Parsing the SQL statement and checking its validity
Transforming the SQL into an internal representation, such as relational algebra
Optimizing the internal representation and creating an execution plan that utilizes index information
Executing the plan and returning the results
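The optimization step above is observable in practice. This sketch uses SQLite from the standard library (a toy schema, chosen here only because it needs no setup) to ask the optimizer for its execution plan and see whether it uses an index.

```python
import sqlite3

# A toy schema with an index on the column we will filter by
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# EXPLAIN QUERY PLAN returns the optimizer's chosen plan without running
# the query; the detail column shows an index search rather than a full scan
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?",
    ("a@example.com",),
).fetchall()
details = [row[-1] for row in plan]
```

Dropping the index and re-running the same statement would show the plan degrade to a full table scan, which is why slow-query analysis usually starts with reading execution plans like this one.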
The execution of SQL is highly complex and involves many considerations, such as:
The use of indexes and caches
The order of table joins
Concurrency control
Transaction management
Over to you: what is your favorite SQL statement?
C, C++, Java, JavaScript, TypeScript, Golang, Rust…
How have programming languages evolved over the past 70 years?
The diagram below shows a brief history of programming languages.
Punched cards were the first generation of programming languages. Assembly languages, which are machine-oriented, are the second generation. Third-generation languages, which are human-oriented, have been around since 1957.
Early languages like Fortran and LISP introduced garbage collection, recursion, and exceptions. These features still exist in modern programming languages.
In 1972, two influential languages were born: Smalltalk and C. Smalltalk greatly influenced scripting languages and client-side languages, while C was developed for Unix systems programming.
In the 1980s, object-oriented languages became popular because of their advantages in building graphical user interfaces. Objective-C and C++ are two famous examples.
In the 1990s, PCs became cheaper, and the languages of this era emphasized security and simplicity. Python was born in this decade; it was easy to learn and extend, and it quickly gained popularity. In 1995, Java, JavaScript, PHP, and Ruby were all born.
In 2000, Microsoft released C#. Although it was bundled with the .NET framework, the language carried many advanced features.
A number of languages were developed in the 2010s to improve on C++ or Java. In the C++ family, we have D, Rust, Zig, and most recently Carbon. In the Java family, we have Golang and Kotlin. The rise of Flutter made Dart popular, TypeScript was developed to be fully compatible with JavaScript, and Apple finally released Swift to replace Objective-C.
Over to you: What’s your favorite language and why? Will AI change the way we use programming languages?