Did we miss anything? If so, please help enrich this article by sharing your thoughts in the comments.
Written by Love Sharma, our guest author. We're constantly seeking valuable content, so if you'd like to contribute to our platform or have any previously published content you'd like us to share, please feel free to drop us a message.
Are serverless databases the future? How do serverless databases differ from traditional cloud databases?
Amazon Aurora Serverless, depicted in the diagram below, is an on-demand, auto-scaling configuration for Amazon Aurora.
Aurora Serverless scales capacity up or down automatically to match business requirements. For example, an eCommerce website preparing for a major promotion can spread the load across multiple databases within milliseconds. Unlike regular cloud databases, which require provisioning and administering database instances, Aurora Serverless can start up and shut down automatically.
By decoupling the compute layer from the data storage layer, Aurora Serverless can charge in a more fine-grained manner. Additionally, a cluster can combine provisioned and serverless instances, enabling existing provisioned databases to join the serverless pool.
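As a concrete illustration, here is a minimal boto3 sketch that creates an Aurora Serverless (v1) cluster. The identifier, credentials, and capacity numbers are placeholders, not a recommended configuration:

```python
import boto3

rds = boto3.client("rds")

# Minimal sketch: an Aurora Serverless v1 cluster that scales between
# 1 and 16 capacity units and pauses compute when idle.
rds.create_db_cluster(
    DBClusterIdentifier="my-serverless-cluster",  # placeholder name
    Engine="aurora-mysql",
    EngineMode="serverless",
    MasterUsername="admin",
    MasterUserPassword="change-me",               # placeholder credential
    ScalingConfiguration={
        "MinCapacity": 1,
        "MaxCapacity": 16,
        "AutoPause": True,             # shut compute down when idle...
        "SecondsUntilAutoPause": 300,  # ...after 5 minutes of inactivity
    },
)
```

The AutoPause setting is what makes the "automatically start up and shut down" behavior visible: when the database is idle, compute is released and billing stops, while the storage layer persists.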
Over to you: Have you used a serverless DB? Does it save cost?
TCP vs. UDP, the key differences:
Connection-oriented vs. connectionless
Three-way handshake vs. no handshake
20-byte header vs. 8-byte header
Point-to-point vs. unicast, multicast, and broadcast
Congestion control vs. no congestion control
Reliable vs. lossy
Flow control vs. no flow control
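The first two differences are easy to see in code. A minimal Python sketch (example.com and port 9999 are placeholders): TCP performs a handshake inside connect() and then delivers a reliable byte stream, while UDP fires off standalone datagrams with no handshake and no delivery guarantee.

```python
import socket

# TCP: connection-oriented -- the three-way handshake happens inside connect()
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("example.com", 80))   # handshake, then a reliable byte stream
tcp.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
print(tcp.recv(1024))              # delivery and ordering are guaranteed
tcp.close()

# UDP: connectionless -- no handshake, each datagram stands alone
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"ping", ("example.com", 9999))  # fire and forget; may be lost
udp.close()
```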
Batch Processing: We aggregate user click activities at the end of the day.
Stream Processing: We detect potential fraud in user click streams in real time.
Both processing models are used in big data processing. The major differences are:
Input: Batch processing works on time-bounded data, meaning the input has an end. Stream processing works on data streams, which have no boundary.
Timeliness: Batch processing suits scenarios where data doesn't need to be processed in real time. Stream processing can produce results as the data is generated.
Output: Batch processing usually generates one-off results, for example, reports. Stream processing outputs can pipe into fraud decision-making engines, monitoring tools, analytics tools, or index/cache updaters.
Fault tolerance: Batch processing tolerates faults better because the batch can be replayed on a fixed set of input data. Stream processing is more challenging because the input keeps flowing in. There are some approaches to this:
a) Micro-batching, which splits the data stream into smaller blocks (used in Spark); b) Checkpointing, which records a marker every few seconds so processing can roll back to it (used in Flink). A minimal sketch of micro-batching follows below.
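To make the micro-batching idea concrete, here is a minimal Python sketch (the function names, batch size, and click events are our own illustration, not Spark's API). It chops an unbounded stream into fixed-size blocks that can be processed, and replayed, like small batch jobs:

```python
def microbatches(stream, batch_size=100):
    """Split an unbounded event stream into fixed-size blocks."""
    batch = []
    for event in stream:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch          # each block is processed like a tiny batch job
            batch = []
    if batch:
        yield batch              # flush the final partial block

# Usage: aggregate clicks per user, one micro-batch at a time.
clicks = ({"user": i % 3} for i in range(1000))   # stand-in for a real stream
for batch in microbatches(clicks):
    counts = {}
    for click in batch:
        counts[click["user"]] = counts.get(click["user"], 0) + 1
```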
Over to you: Have you worked on stream processing systems?
Explain code
Review and debug code
Translate code from one language to another
Learn a new language
Write tests
Modify existing code
Write comments and dev docs
A good engineer needs to recognize how data structures are used in our daily lives.
list: keep your Twitter feeds
stack: support undo/redo in a word editor (see the sketch after this list)
queue: keep printer jobs, or send user actions in a game
heap: task scheduling
tree: keep the HTML document, or support AI decision-making
suffix tree: search for a string in a document
graph: track friendships, or find paths
R-tree: find the nearest neighbor
vertex buffer: send data to the GPU for rendering
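As a small illustration of the stack item above, here is a minimal Python sketch of the classic two-stack undo/redo pattern (the Editor class and its methods are our own toy example):

```python
class Editor:
    """Two stacks give undo/redo: states we can go back to, and forward to."""

    def __init__(self):
        self.text = ""
        self.undo_stack = []   # previous states
        self.redo_stack = []   # undone states

    def type(self, s):
        self.undo_stack.append(self.text)
        self.redo_stack.clear()   # a new edit invalidates the redo history
        self.text += s

    def undo(self):
        if self.undo_stack:
            self.redo_stack.append(self.text)
            self.text = self.undo_stack.pop()

    def redo(self):
        if self.redo_stack:
            self.undo_stack.append(self.text)
            self.text = self.redo_stack.pop()

ed = Editor()
ed.type("hello ")
ed.type("world")
ed.undo()           # ed.text == "hello "
ed.redo()           # ed.text == "hello world"
```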
To conclude, data structures play an important role in our daily lives, both in the technology we build and in our everyday experiences. Engineers should understand these data structures and their use cases to create effective and efficient solutions.
Over to you: Which additional data structures have we overlooked?
Message brokers play a crucial role when building distributed systems or microservices to improve their performance, scalability, and maintainability.
Decoupling: Message brokers promote independent development, deployment, and scaling by creating a separation between software components. The result is easier maintenance and troubleshooting.
Asynchronous communication: A message broker allows components to communicate without waiting for responses, making the system more efficient and enabling effective load balancing.
Reliability: Message brokers provide buffering and message persistence, ensuring messages are not lost when components fail.
Scalability: Message brokers can manage a high volume of messages, allowing your system to scale horizontally by adding more instances of the message broker as needed.
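As a small illustration of these properties, here is a minimal sketch using the pika client against a RabbitMQ broker assumed to run on localhost (the queue name and message are placeholders). The producer publishes without waiting on the consumer (decoupled, asynchronous), and the durable queue plus persistent messages provide reliability:

```python
import pika

# Assumes a RabbitMQ broker on localhost and the pika client library.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Durable queue: survives a broker restart (buffering and persistence).
channel.queue_declare(queue="orders", durable=True)

# Producer: publish and move on -- no waiting for a consumer (asynchronous).
channel.basic_publish(
    exchange="",
    routing_key="orders",
    body=b"order-created:42",
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)

# Consumer: runs independently, possibly in another process or service.
def handle(ch, method, properties, body):
    print("processing", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after success

channel.basic_consume(queue="orders", on_message_callback=handle)
channel.start_consuming()
```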
To summarize, a message broker can improve efficiency, scalability, and reliability in your architecture, and considering one early can greatly benefit your application's long-term success. Always think about the bigger picture and how your design choices will affect the overall project.
Twitter open-sourced its recommendation algorithm. We spent a few days analyzing it.
The diagram below shows the detailed pipeline based on the open-sourced algorithm.
The process involves 5 stages:
Candidate Sourcing: start with ~500 million Tweets
Global Filtering: down to ~1,500 candidates
Scoring & Ranking: a 48M-parameter neural network, with a Twitter Blue boost
Filtering: to achieve author and content diversity
Mixing: blend in Ads recommendations and Who to Follow
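To make the funnel shape concrete, here is a toy, self-contained Python sketch of a five-stage pipeline. Everything in it (the pool size, fields, threshold, and scoring) is our own placeholder, not Twitter's actual code:

```python
import random

# Stand-in candidate pool; real sourcing pulls from in- and out-of-network.
TWEETS = [{"id": i, "author": i % 50, "score": random.random()}
          for i in range(10_000)]

def build_for_you_timeline(limit=50):
    candidates = TWEETS                                        # 1. candidate sourcing
    shortlist = [t for t in candidates if t["score"] > 0.5]    # 2. global filtering
    ranked = sorted(shortlist, key=lambda t: t["score"],
                    reverse=True)                              # 3. scoring & ranking
    seen, diverse = set(), []
    for t in ranked:                                           # 4. author-diversity filter
        if t["author"] not in seen:
            diverse.append(t)
            seen.add(t["author"])
    return diverse[:limit]                                     # 5. mixing (ads, Who to
                                                               #    Follow) happens downstream

timeline = build_for_you_timeline()
```

The point of the sketch is the funnel: each stage drastically shrinks the candidate set so the expensive scoring only runs on a small shortlist.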
This post was jointly created by ByteByteGo and Mem. Special thanks to Scott Mackie, founding engineer at Mem, for putting this together.
Mem is building the world’s first knowledge assistant. In next week’s ByteByteGo guest newsletter, Mem will be sharing lessons they’ve learned from their extensive work with large language models and building AI-native infrastructure.
In the 1990s, Secure Shell (SSH) was developed as a secure alternative to Telnet for remote system access and management. SSH sets up an encrypted channel between client and server, protecting credentials and session data from eavesdropping.
The following happens when you type "ssh hostname":
Hostname resolution: Convert the hostname to an IP address using DNS or the local hosts file.
SSH client initialization: Connect to the remote SSH server.
TCP handshake: Establish a reliable connection.
Protocol negotiation: Agree on the SSH protocol version and encryption algorithms.
Key exchange: Generate a shared secret key securely.
Server authentication: Verify the server's public key.
User authentication: Authenticate using a password, public key, or another method.
Session establishment: Create an encrypted SSH session and access the remote system.
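For a programmatic view of these steps, here is a minimal sketch using Python's paramiko library (the hostname, username, and key path are placeholders); paramiko carries out the protocol negotiation, key exchange, and authentication described above:

```python
import os
import paramiko

# Minimal SSH client sketch; "example.com" and "alice" are placeholders.
client = paramiko.SSHClient()
client.load_system_host_keys()   # server authentication: check known_hosts
client.connect(
    "example.com",
    username="alice",
    key_filename=os.path.expanduser("~/.ssh/id_ed25519"),  # key-based auth
)
stdin, stdout, stderr = client.exec_command("uname -a")  # runs over the
print(stdout.read().decode())                            # encrypted session
client.close()
```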
Always use key-based authentication with SSH for better security, and learn the SSH configuration files and options to customize your workflow. Keep up with best practices and security recommendations to keep remote access secure and efficient.
Over to you: can you tell the difference between SSH, SSL, and TLS?