14 Mar

ABN Techweek – How Discord Stores Trillions Of Messages?

ABN Tech Week trains engineers on the most important issues in building highly scalable software systems. The topic today is: How Discord Stores Trillions Of Messages?

HOW DISCORD STORES TRILLIONS OF MESSAGES

THE DIAGRAM BELOW SHOWS THE EVOLUTION OF MESSAGE STORAGE AT DISCORD:

MONGODB ➡️ CASSANDRA ➡️ SCYLLADB

IN 2015, THE FIRST VERSION OF DISCORD WAS BUILT ON TOP OF A SINGLE MONGODB REPLICA. AROUND NOV 2015, MONGODB STORED 100 MILLION MESSAGES AND THE RAM COULDN’T HOLD THE DATA AND INDEX ANY LONGER. LATENCY BECAME UNPREDICTABLE, SO MESSAGE STORAGE NEEDED TO BE MOVED TO ANOTHER DATABASE. CASSANDRA WAS CHOSEN.

IN 2017, DISCORD HAD 12 CASSANDRA NODES AND STORED BILLIONS OF MESSAGES.

AT THE BEGINNING OF 2022, IT HAD 177 NODES WITH TRILLIONS OF MESSAGES. AT THIS POINT, LATENCY WAS UNPREDICTABLE, AND MAINTENANCE OPERATIONS BECAME TOO EXPENSIVE TO RUN.

THEN CAME SCYLLADB.

THERE WERE SEVERAL REASONS FOR THE LATENCY AND MAINTENANCE ISSUES WITH CASSANDRA:

– CASSANDRA USES AN LSM TREE AS ITS INTERNAL DATA STRUCTURE, SO READS ARE MORE EXPENSIVE THAN WRITES. THERE CAN BE MANY CONCURRENT READS ON A SERVER WITH HUNDREDS OF USERS, RESULTING IN HOTSPOTS.

– MAINTAINING CLUSTERS, SUCH AS COMPACTING SSTABLES, IMPACTS PERFORMANCE.

– GARBAGE COLLECTION PAUSES CAUSED SIGNIFICANT LATENCY SPIKES.
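The read/write asymmetry of an LSM tree can be illustrated with a toy model. This is a minimal sketch, not Cassandra's implementation: the class name `TinyLSM`, the `memtable_limit` parameter, and the probe counter are assumptions made for illustration, and real systems add bloom filters, indexes, and on-disk files that are omitted here.

```python
class TinyLSM:
    """Toy LSM tree: an in-memory memtable plus immutable sorted runs (SSTables)."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []              # oldest run first, newest run last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # A write only touches the in-memory table, so it is cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a sorted, immutable run.
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        # A read may probe the memtable and then every run, newest first.
        probes = 1
        if key in self.memtable:
            return self.memtable[key], probes
        for run in reversed(self.sstables):
            probes += 1
            if key in run:
                return run[key], probes
        return None, probes

    def compact(self):
        # Merge all runs into one; newer runs overwrite older ones.
        merged = {}
        for run in self.sstables:
            merged.update(run)
        self.sstables = [dict(sorted(merged.items()))] if merged else []


db = TinyLSM(memtable_limit=2)
for i in range(6):
    db.put(f"msg{i}", f"hello {i}")

value_before, probes_before = db.get("msg0")   # oldest key: every run is probed
db.compact()                                   # maintenance work done in the background
value_after, probes_after = db.get("msg0")
print(probes_before, probes_after)             # prints "4 2"
```

In this toy model, compaction brings read cost back down, but the merge itself is extra work, which mirrors the maintenance cost mentioned in the list above.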

SCYLLADB IS A CASSANDRA-COMPATIBLE DATABASE WRITTEN IN C++. DISCORD REDESIGNED ITS ARCHITECTURE TO HAVE A MONOLITHIC API, A DATA SERVICE WRITTEN IN RUST, AND SCYLLADB-BASED STORAGE.

THE P99 READ LATENCY IN SCYLLADB IS 15MS COMPARED TO 40-125MS IN CASSANDRA. THE P99 WRITE LATENCY IS 5MS COMPARED TO 5-70MS IN CASSANDRA.
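For readers unfamiliar with the metric, a p99 latency is the value that 99% of requests complete at or below. The sketch below computes it with the nearest-rank method. The `percentile` helper and the sample latencies are made up for illustration and are not Discord's measurements.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)   # 1-based rank
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [5] * 98 + [40, 125]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(p50, p99)   # prints "5 40"
```

The median hides the two slow requests entirely, which is why tail percentiles such as p99 are the usual way to compare database latency.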

10 Mar

ABN Techweek – How does the browser render a web page?

HOW DOES THE BROWSER RENDER A WEB PAGE?

1. PARSE HTML AND GENERATE DOCUMENT OBJECT MODEL (DOM) TREE.

WHEN THE BROWSER RECEIVES THE HTML DATA FROM THE SERVER, IT IMMEDIATELY PARSES IT AND CONVERTS IT INTO A DOM TREE.

2. PARSE CSS AND GENERATE CSSOM TREE.

THE STYLES (CSS FILES) ARE LOADED AND PARSED INTO THE CSSOM (CSS OBJECT MODEL).

3. COMBINE DOM TREE AND CSSOM TREE TO CONSTRUCT THE RENDER TREE.

WITH THE DOM AND CSSOM, A RENDER TREE IS CREATED. THE RENDER TREE MAPS ALL DOM STRUCTURES EXCEPT INVISIBLE ELEMENTS (SUCH AS <HEAD> OR ELEMENTS WITH DISPLAY: NONE). IN OTHER WORDS, THE RENDER TREE IS A VISUAL REPRESENTATION OF THE DOM.

4. LAYOUT.

THE BROWSER CALCULATES THE GEOMETRIC INFORMATION (POSITION, SIZE) OF EACH ELEMENT IN THE RENDER TREE. THIS STEP IS CALLED LAYOUT.

5. PAINTING.

AFTER THE LAYOUT IS COMPLETE, THE RENDERING TREE IS TRANSFORMED INTO THE ACTUAL CONTENT ON THE SCREEN. THIS STEP IS CALLED PAINTING. THE BROWSER GETS THE ABSOLUTE PIXELS OF THE CONTENT.

6. DISPLAY.

FINALLY, THE BROWSER SENDS THE ABSOLUTE PIXELS TO THE GPU AND DISPLAYS THEM ON THE PAGE.
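Step 3 above, building the render tree from the DOM and the computed styles, can be sketched as a tree filter. This is a simplified illustration, not how any browser engine is implemented: the `Node` class, the `NON_VISUAL_TAGS` set, and the style dictionaries are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    style: dict = field(default_factory=dict)      # simplified computed style from the CSSOM
    children: list = field(default_factory=list)


# Hypothetical set of tags that never produce visual output.
NON_VISUAL_TAGS = {"head", "script", "meta", "title", "link", "style"}


def build_render_tree(node):
    """Copy the DOM subtree without invisible nodes; return None if the node itself is invisible."""
    if node.tag in NON_VISUAL_TAGS or node.style.get("display") == "none":
        return None
    kept = [r for r in (build_render_tree(c) for c in node.children) if r is not None]
    return Node(node.tag, dict(node.style), kept)


def tags(node):
    """Flatten a tree into a pre-order list of tag names."""
    return [node.tag] + [t for c in node.children for t in tags(c)]


dom = Node("html", children=[
    Node("head", children=[Node("title")]),
    Node("body", children=[
        Node("h1"),
        Node("div", {"display": "none"}, [Node("span")]),
        Node("p"),
    ]),
])

render_tree = build_render_tree(dom)
print(tags(render_tree))   # prints "['html', 'body', 'h1', 'p']"
```

Note that a node with `display: none` drops its whole subtree, so the `span` inside the hidden `div` never reaches layout or painting.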

09 Mar

ABN Techweek – HOW DO VISA AND MASTERCARD PREVENT CNP (CARD-NOT-PRESENT) FRAUD

ABN Tech Week trains engineers on the most important issues in building highly scalable software systems. The topic for today is: HOW DO VISA AND MASTERCARD PREVENT CNP (CARD-NOT-PRESENT) FRAUD?

THE DIAGRAM BELOW SHOWS HOW 3-D SECURE PROTOCOL WORKS IN ORDER TO PROTECT ONLINE PURCHASES FROM CNP FRAUD.

THE 3-D SECURE (3DS) PROTOCOL IS AN ADDITIONAL SECURITY LAYER FOR ONLINE CARD TRANSACTIONS. IT WAS ORIGINALLY DEVELOPED IN 1999, AND THE LATEST VERSION (VERSION 2) WAS PUBLISHED IN 2016 TO COMPLY WITH NEW EU AUTHENTICATION REQUIREMENTS.

3-D REFERS TO THE “THREE DOMAINS” – THE ACQUIRER DOMAIN, THE ISSUER DOMAIN, AND THE INTEROPERABILITY DOMAIN.

FRICTIONLESS FLOW – NORMAL TRANSACTIONS

🔹 STEPS 1-2: A CONSUMER MAKES AN ONLINE PURCHASE AND HITS CHECKOUT TO ENTER PAYMENT CARD DETAILS. IF THE MERCHANT’S WEBSITE ENABLES 3DS, THE 3DS SERVER SENDS THE AUTHENTICATION REQUEST TO THE DIRECTORY SERVER (DS).

🔹 STEPS 3-6: BASED ON THE PRIMARY ACCOUNT NUMBER (PAN), DS FINDS THE CARD ISSUER’S ACCESS CONTROL SERVER (ACS) AND CHECKS IF THE CARD IS REGISTERED IN 3DS. THE RESPONSE IS SENT BACK TO DS, MERCHANT, AND CARDHOLDER.

🔹 STEPS 7-10: NOW THAT THE CARD IS AUTHENTICATED, THE CARDHOLDER CAN PROCEED WITH THE PAYMENT REQUEST. THE REQUEST GOES THROUGH THE ACQUIRER AND THE CARD NETWORK AS USUAL.

CHALLENGE FLOW – SUSPICIOUS TRANSACTIONS

🔹 STEPS 1-6: THE STEPS ARE THE SAME AS IN THE FRICTIONLESS FLOW. HOWEVER, THE CARDHOLDER IS PROMPTED TO PROVIDE PROOF OF CARD OWNERSHIP.

🔹 STEPS 7-9: THE CARDHOLDER CAN VERIFY VIA OTP (ONE-TIME PASSWORD), SECURITY QUESTIONS, OR BIOMETRICS. THE RESULTS ARE SENT BACK VIA THE DS, WHICH MAKES THE FLOW MORE SECURE.

🔹 STEPS 10-13: IF THE CARD IS AUTHENTICATED, THE CARDHOLDER CAN CONTINUE WITH THE ONLINE PAYMENT TRANSACTION.
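The two flows above can be sketched as a small simulation. This is a rough model under stated assumptions, not the 3DS specification: the class names, the status strings, the numeric `risk_score` with its threshold, the PAN-prefix lookup, and the in-memory OTP store are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessControlServer:
    """Issuer-side ACS. Enrollment set, risk threshold, and OTP store are hypothetical."""
    enrolled_pans: set
    otps: dict
    risk_threshold: float = 0.5

    def authenticate(self, pan, risk_score):
        if pan not in self.enrolled_pans:
            return "NOT_ENROLLED"
        # Low-risk transactions pass without cardholder interaction (frictionless flow).
        return "AUTHENTICATED" if risk_score < self.risk_threshold else "CHALLENGE_REQUIRED"

    def verify_challenge(self, pan, otp):
        return "AUTHENTICATED" if self.otps.get(pan) == otp else "FAILED"


class DirectoryServer:
    """Routes an authentication request to the issuer's ACS based on the PAN prefix."""

    def __init__(self):
        self.acs_by_prefix = {}

    def register(self, prefix, acs):
        self.acs_by_prefix[prefix] = acs

    def find_acs(self, pan):
        for prefix, acs in self.acs_by_prefix.items():
            if pan.startswith(prefix):
                return acs
        return None


def three_ds_authenticate(ds, pan, risk_score, otp=None):
    acs = ds.find_acs(pan)
    if acs is None:
        return "NO_ACS"
    result = acs.authenticate(pan, risk_score)
    if result == "CHALLENGE_REQUIRED" and otp is not None:
        result = acs.verify_challenge(pan, otp)   # challenge flow
    return result


pan = "4000000000000002"                          # made-up card number
acs = AccessControlServer({pan}, {pan: "123456"})
ds = DirectoryServer()
ds.register("400000", acs)

print(three_ds_authenticate(ds, pan, 0.1))                 # frictionless
print(three_ds_authenticate(ds, pan, 0.9, otp="123456"))   # challenge passed
```

Only after the result is `AUTHENTICATED` would the merchant continue with the normal payment request through the acquirer and card network.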

07 Mar

ABN Techweek – What distinguishes MVC, MVP, MVVM, MVVM-C, and VIPER architecture patterns from each other?

ABN Tech Week trains engineers on the most important issues in building highly scalable software systems. The topic today is: What distinguishes MVC, MVP, MVVM, MVVM-C, and VIPER architecture patterns from each other?

WHAT DISTINGUISHES MVC, MVP, MVVM, MVVM-C, AND VIPER ARCHITECTURE PATTERNS FROM EACH OTHER?

THESE ARCHITECTURE PATTERNS ARE AMONG THE MOST COMMONLY USED IN APP DEVELOPMENT, WHETHER ON IOS OR ANDROID PLATFORMS. DEVELOPERS HAVE INTRODUCED THEM TO OVERCOME THE LIMITATIONS OF EARLIER PATTERNS. SO, HOW DO THEY DIFFER?

🔹 MVC, THE OLDEST PATTERN, DATES BACK ALMOST 50 YEARS

🔹 EVERY PATTERN HAS A "VIEW" (V) RESPONSIBLE FOR DISPLAYING CONTENT AND RECEIVING USER INPUT

🔹 MOST PATTERNS INCLUDE A "MODEL" (M) TO MANAGE BUSINESS DATA

🔹 "CONTROLLER," "PRESENTER," AND "VIEW-MODEL" ARE TRANSLATORS THAT MEDIATE BETWEEN THE VIEW AND THE MODEL ("ENTITY" IN THE VIPER PATTERN)

🔹 THESE TRANSLATORS CAN BE QUITE COMPLEX TO WRITE, SO VARIOUS PATTERNS HAVE BEEN PROPOSED TO MAKE THEM MORE MAINTAINABLE
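To make the "translator" role concrete, here is a minimal sketch of one of these patterns, MVP, using a tap counter. The class names and the console-style view are assumptions made for illustration; real iOS or Android code would use the platform's UI classes instead.

```python
class CounterModel:
    """Model: owns the business data."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1


class ConsoleView:
    """View: displays content and forwards user input; it holds no business logic."""

    def __init__(self):
        self.rendered = []
        self.on_tap = None            # callback installed by the presenter

    def show(self, text):
        self.rendered.append(text)

    def tap(self):                    # simulates the user tapping a button
        if self.on_tap:
            self.on_tap()


class CounterPresenter:
    """Presenter: turns user input into model updates and model state into display text."""

    def __init__(self, model, view):
        self.model, self.view = model, view
        view.on_tap = self.handle_tap
        self.refresh()

    def handle_tap(self):
        self.model.increment()
        self.refresh()

    def refresh(self):
        self.view.show(f"Tapped {self.model.count} times")


view = ConsoleView()
presenter = CounterPresenter(CounterModel(), view)
view.tap()
view.tap()
print(view.rendered[-1])   # prints "Tapped 2 times"
```

The view and the model never reference each other, so either can be swapped or tested in isolation. The other patterns mostly differ in how this middle layer is wired to the view.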

06 Mar

ChatGPT – How did we get here? ABN Techweek

ChatGPT seemed to come out of nowhere, but it was built on top of decades of research.
1950s
In this stage, people still used primitive rule-based models.
1980s
In the 1980s, machine learning started to pick up and was used for classification. Training was conducted on small data sets.
1990s – 2000s
From the 1990s, neural networks, which loosely imitate the human brain, were used for labeling and training. There are generally 3 types:
– CNN (Convolutional Neural Network): often used in visual-related tasks.
– RNN (Recurrent Neural Network): useful in natural language tasks.
– GAN (Generative Adversarial Network): comprised of two networks (a generator and a discriminator). This is a generative model that can produce novel images resembling its training images.
2017
“Attention Is All You Need” laid the foundation for generative AI. The transformer model greatly shortens training time through parallelism.
2018 – Now
In this stage, due to the major progress of the transformer model, we see various models train on a massive amount of data. Human demonstration becomes the learning content of the model. We’ve seen many AI writers that can write articles, news, technical docs, and even code. This has great commercial value as well and sets off a global whirlwind.

04 Mar

ABN Techweek – What are the API architectural styles?

THE DIAGRAM BELOW SHOWS THE COMMON API ARCHITECTURAL STYLES IN ONE PICTURE.

🔹 1. REST

PROPOSED IN 2000, REST IS THE MOST USED STYLE. IT IS OFTEN USED BETWEEN FRONT-END CLIENTS AND BACK-END SERVICES. IT IS COMPLIANT WITH 6 ARCHITECTURAL CONSTRAINTS. THE PAYLOAD FORMAT CAN BE JSON, XML, HTML, OR PLAIN TEXT.

🔹 2. GRAPHQL

GRAPHQL WAS PROPOSED IN 2015 BY META. IT PROVIDES A SCHEMA AND TYPE SYSTEM, SUITABLE FOR COMPLEX SYSTEMS WHERE THE RELATIONSHIPS BETWEEN ENTITIES ARE GRAPH-LIKE. FOR EXAMPLE, IN THE DIAGRAM BELOW, GRAPHQL CAN RETRIEVE USER AND ORDER INFORMATION IN ONE CALL, WHILE IN REST THIS NEEDS MULTIPLE CALLS.

GRAPHQL IS NOT A REPLACEMENT FOR REST. IT CAN BE BUILT UPON EXISTING REST SERVICES.
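The "one call versus multiple calls" difference can be simulated without any network or GraphQL library. In this sketch the in-memory data, the function names, and the dictionary-based selection format are all hypothetical stand-ins; a real GraphQL server would parse a query string against a schema.

```python
# Made-up sample data standing in for two back-end resources.
USERS = {1: {"id": 1, "name": "Ada"}}
ORDERS = {1: [{"id": 10, "total": 25.0}, {"id": 11, "total": 7.5}]}

calls = {"rest": 0, "graph": 0}   # counts simulated round trips


# REST style: one endpoint per resource, so the client makes two round trips.
def rest_get_user(user_id):
    calls["rest"] += 1
    return USERS[user_id]


def rest_get_orders(user_id):
    calls["rest"] += 1
    return ORDERS[user_id]


# GraphQL style: the client sends one selection and the server resolves every requested field.
def graph_query(user_id, selection):
    calls["graph"] += 1
    user = USERS[user_id]
    result = {f: user[f] for f in selection.get("user", [])}
    if "orders" in selection:
        result["orders"] = [{f: o[f] for f in selection["orders"]} for o in ORDERS[user_id]]
    return result


rest_view = {
    "name": rest_get_user(1)["name"],
    "orders": [{"total": o["total"]} for o in rest_get_orders(1)],
}
graph_view = graph_query(1, {"user": ["name"], "orders": ["total"]})

print(rest_view == graph_view, calls)   # prints "True {'rest': 2, 'graph': 1}"
```

The resolver here reads from the same data the REST functions expose, which is one way to picture GraphQL being layered on top of existing REST services.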

🔹 3. WEBSOCKET

WEBSOCKET IS A PROTOCOL THAT PROVIDES FULL-DUPLEX COMMUNICATION OVER A SINGLE TCP CONNECTION. CLIENTS OPEN WEBSOCKET CONNECTIONS TO RECEIVE REAL-TIME UPDATES FROM THE BACK-END SERVICES. UNLIKE REST, WHERE THE CLIENT ALWAYS “PULLS” DATA, WEBSOCKET ENABLES DATA TO BE “PUSHED”.

🔹 4. WEBHOOK

WEBHOOKS ARE USUALLY USED FOR ASYNCHRONOUS CALLBACKS FROM THIRD-PARTY APIS. IN THE DIAGRAM BELOW, FOR EXAMPLE, WE USE STRIPE OR PAYPAL FOR PAYMENT CHANNELS AND REGISTER A WEBHOOK FOR PAYMENT RESULTS. WHEN THE THIRD-PARTY PAYMENT SERVICE IS DONE, IT NOTIFIES OUR PAYMENT SERVICE WHETHER THE PAYMENT SUCCEEDED OR FAILED. WEBHOOK CALLS ARE USUALLY PART OF THE SYSTEM’S STATE MACHINE.
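A webhook handler that drives a payment state machine might look like the sketch below. The event field names (`event_id`, `payment_id`, `type`), the event type strings, and the state names are assumptions for illustration and do not match Stripe's or PayPal's actual payloads. Verification of the provider's signature, which real integrations typically need, is left out.

```python
# Allowed transitions: current state -> {event type -> next state}.
TRANSITIONS = {
    "PENDING": {"payment.succeeded": "SUCCEEDED", "payment.failed": "FAILED"},
}


class PaymentService:
    def __init__(self):
        self.payments = {}          # payment_id -> state
        self.seen_events = set()    # event ids already processed

    def start_payment(self, payment_id):
        self.payments[payment_id] = "PENDING"

    def handle_webhook(self, event):
        """Apply a provider notification and return the payment's state afterwards."""
        payment_id = event["payment_id"]
        if event["event_id"] in self.seen_events:
            return self.payments[payment_id]      # duplicate delivery: ignore
        self.seen_events.add(event["event_id"])
        state = self.payments[payment_id]
        next_state = TRANSITIONS.get(state, {}).get(event["type"])
        if next_state is not None:                # terminal states accept no further events
            self.payments[payment_id] = next_state
        return self.payments[payment_id]


service = PaymentService()
service.start_payment("p1")
ok = {"event_id": "e1", "payment_id": "p1", "type": "payment.succeeded"}
late_failure = {"event_id": "e2", "payment_id": "p1", "type": "payment.failed"}

first = service.handle_webhook(ok)
repeat = service.handle_webhook(ok)               # same event delivered twice
after_late = service.handle_webhook(late_failure)
print(first, repeat, after_late)                  # prints "SUCCEEDED SUCCEEDED SUCCEEDED"
```

Tracking processed event ids and rejecting transitions out of terminal states keeps the handler safe against repeated or out-of-order deliveries.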

🔹 5. GRPC

RELEASED IN 2016, GRPC IS USED FOR COMMUNICATIONS AMONG MICROSERVICES. THE GRPC LIBRARY HANDLES ENCODING/DECODING AND DATA TRANSMISSION.

🔹 6. SOAP

SOAP STANDS FOR SIMPLE OBJECT ACCESS PROTOCOL. ITS PAYLOAD IS XML ONLY, SUITABLE FOR COMMUNICATIONS BETWEEN INTERNAL SYSTEMS.

03 Mar

A guide to CDN (Content Delivery Network) – ABN Techweek

CDNs are distributed server networks that help improve the performance, reliability, and security of content delivery on the internet.

The overall CDN diagram explains the following:

Edge servers are located closer to the end user than traditional servers, which helps reduce latency and improve website performance.

Edge computing is a type of computing that processes data closer to the end user rather than in a centralized data center. This helps to reduce latency and improve the performance of applications that require real-time processing, such as video streaming or online gaming.

Cloud gaming is online gaming that uses cloud computing to provide users with high-quality, low-latency gaming experiences.

Together, these technologies are transforming how we access and consume digital content. By providing faster, more reliable, and more immersive experiences for users, they are helping to drive the growth of the digital economy and create new opportunities for businesses and consumers alike.

02 Mar

Load balancer and API gateway – ABN Techweek

1️⃣ NLB (NETWORK LOAD BALANCER) IS USUALLY DEPLOYED IN FRONT OF THE API GATEWAY, HANDLING TRAFFIC ROUTING BASED ON IP. IT DOES NOT PARSE THE HTTP REQUESTS.

2️⃣ ALB (APPLICATION LOAD BALANCER) ROUTES REQUESTS BASED ON HTTP HEADER OR URL AND THUS CAN PROVIDE RICHER ROUTING RULES. WE CAN CHOOSE THE LOAD BALANCER BASED ON ROUTING REQUIREMENTS. FOR SIMPLE SERVICES WITH A SMALLER SCALE, ONE LOAD BALANCER IS ENOUGH.

3️⃣ THE API GATEWAY WORKS AT THE APPLICATION LEVEL, HANDLING TASKS SUCH AS AUTHENTICATION AND RATE LIMITING, SO ITS RESPONSIBILITIES DIFFER FROM THOSE OF THE LOAD BALANCER.
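The difference between the two load balancer types can be sketched as two routing functions. This is an illustrative model only: the function names, the hash-based backend choice, and the rule format are assumptions, not the behavior of any specific cloud product.

```python
import hashlib


def nlb_pick(src_ip, src_port, backends):
    """Layer 4: choose a backend from connection metadata only; the HTTP payload is never parsed."""
    key = f"{src_ip}:{src_port}".encode()
    index = int.from_bytes(hashlib.sha256(key).digest()[:8], "big") % len(backends)
    return backends[index]


def alb_pick(path, headers, rules, default):
    """Layer 7: choose a target from the parsed request (URL path prefix or header value)."""
    for rule in rules:
        if "path_prefix" in rule and path.startswith(rule["path_prefix"]):
            return rule["target"]
        if "header" in rule and headers.get(rule["header"][0]) == rule["header"][1]:
            return rule["target"]
    return default


backends = ["gw-1", "gw-2", "gw-3"]               # hypothetical API gateway instances
rules = [
    {"path_prefix": "/api/", "target": "api"},
    {"header": ("X-Canary", "1"), "target": "canary"},
]

print(nlb_pick("203.0.113.7", 51234, backends) in backends)   # prints "True"
print(alb_pick("/api/orders", {}, rules, "web"))              # prints "api"
print(alb_pick("/", {"X-Canary": "1"}, rules, "web"))         # prints "canary"
```

The layer 4 function sees only addresses and ports, so the same connection always lands on the same backend, while the layer 7 function needs the parsed request and can therefore express richer rules.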
