Introduction to Distributed Monitoring

Learn why monitoring in a distributed system is crucial.

Need for monitoring

Let’s go over how the failure of a single service can disrupt the systems that depend on it. Monitoring plays a vital role in avoiding such cascading failures: it gives early warnings and steers us toward the root cause of faults.

Let’s consider a scenario where a user uploads a video, intro-to-system-design, to YouTube. The UI service on server A takes the video information and passes the data to service 2 on server B. Service 2 makes an entry in the database and stores the video in blob storage. Another service, service 3, on server C manages the replication and synchronization of databases X and Y.

In this scenario, service 3 fails due to some error, so replication between databases X and Y stops. Service 2 then makes an entry in database X only. Database X crashes, and the request to fetch the video is routed to database Y, which never received the entry. When the user tries to play the video intro-to-system-design, they get a “Video not found…” error.
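The failure sequence above can be sketched as a small in-memory simulation. This is only an illustrative sketch, not how YouTube or any real system works: the `Database` and `ReplicationService` classes, the `upload`, `fetch`, and `replication_alerts` functions, and the exact error string are all hypothetical names introduced here. It shows how a basic check that compares the two databases could raise a warning before the user ever sees the error.

```python
class Database:
    """Hypothetical in-memory stand-in for a video metadata database."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.up = True


class ReplicationService:
    """Hypothetical stand-in for service 3: copies entries from X to Y."""

    def __init__(self, source, replica):
        self.source = source
        self.replica = replica
        self.healthy = True

    def sync(self):
        if not self.healthy:
            return False  # a failed service copies nothing
        self.replica.rows.update(self.source.rows)
        return True


def upload(db, video_id):
    """Stand-in for service 2: record the video in one database."""
    db.rows[video_id] = {"blob": f"blob://{video_id}"}


def fetch(video_id, primary, fallback):
    """Route the read to the primary, or to the fallback if the primary is down."""
    db = primary if primary.up else fallback
    if video_id not in db.rows:
        return "Video not found"
    return db.rows[video_id]["blob"]


def replication_alerts(source, replica):
    """A very simple monitor: report entries that the replica is missing."""
    missing = sorted(set(source.rows) - set(replica.rows))
    return [f"{replica.name} is missing {video_id}" for video_id in missing]


def run_scenario():
    x, y = Database("X"), Database("Y")
    service3 = ReplicationService(x, y)

    service3.healthy = False             # service 3 fails
    upload(x, "intro-to-system-design")  # service 2 writes to X only
    service3.sync()                      # no-op because service 3 is down

    alerts = replication_alerts(x, y)    # monitoring could flag the gap here

    x.up = False                         # database X crashes
    result = fetch("intro-to-system-design", x, y)
    return alerts, result


if __name__ == "__main__":
    alerts, result = run_scenario()
    print(alerts)  # ['Y is missing intro-to-system-design']
    print(result)  # Video not found
```

In this sketch the alert is available as soon as replication falls behind, which is earlier than the user-facing error. That gap between the first detectable symptom and the visible failure is what makes early warnings useful.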

Figure: The user uploads a video on YouTube (slide 1 of 15)

The example above is relatively simple. In reality, complex problems are ...