Introduction to Distributed Monitoring
Learn why monitoring in a distributed system is crucial.
Need for monitoring
Let’s look at how the failure of a single service can disrupt the systems that depend on it. Monitoring plays a vital role in avoiding such cascading failures: it gives early warnings and steers us toward the root cause of a fault.
Let’s consider a scenario where a user uploads a video, intro-to-system-design, to YouTube. The UI service on server A takes the video information and passes the data to service 2 on server B. Service 2 makes an entry in the database and stores the video in blob storage. Another service, service 3 on server C, manages the replication and synchronization of databases X and Y.
In this scenario, service 3 fails due to some error, and service 2 then makes an entry in database X. Because service 3 is down, that entry is never replicated to database Y. Database X then crashes, and requests to fetch a video are routed to database Y. When the user tries to play intro-to-system-design, they get a “Video not found…” error.
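The failure sequence above can be replayed in a few lines of Python. This is only an illustrative sketch: the class and function names, the in-memory dictionaries standing in for databases X and Y, the `blob://` locations, and the `replication_alerts` check are all assumptions made for the example, not part of any real system. It shows that a simple check on the replication service would have raised an alert before database X crashed.

```python
class Database:
    """In-memory stand-in for database X or Y (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.up = True


class ReplicationService:
    """Stand-in for service 3: copies rows from a primary to a replica."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.healthy = True

    def sync(self):
        if not self.healthy:
            return  # the service has failed, so nothing is copied
        self.replica.rows.update(self.primary.rows)


def upload(video_id, primary, replicator):
    """Stand-in for service 2: record the video, then let replication run."""
    primary.rows[video_id] = "blob://" + video_id  # made-up blob location
    replicator.sync()


def fetch(video_id, primary, replica):
    """Read from the primary, or from the replica if the primary is down."""
    db = primary if primary.up else replica
    if video_id not in db.rows:
        return "Video not found"
    return db.rows[video_id]


def replication_alerts(replicator):
    """Hypothetical monitoring check on the replication service."""
    missing = set(replicator.primary.rows) - set(replicator.replica.rows)
    alerts = []
    if not replicator.healthy:
        alerts.append("replication service is down")
    if missing:
        alerts.append("replica is missing %d row(s)" % len(missing))
    return alerts


db_x, db_y = Database("X"), Database("Y")
service_3 = ReplicationService(db_x, db_y)

service_3.healthy = False                           # service 3 fails
upload("intro-to-system-design", db_x, service_3)   # entry lands only in X
alerts = replication_alerts(service_3)              # a monitor would fire here
db_x.up = False                                     # database X crashes
result = fetch("intro-to-system-design", db_x, db_y)

print(alerts)
print(result)
```

In this sketch, the alerts are available while database X is still up, which is the window in which an operator could repair replication. Without the check, the first visible symptom is the user-facing error.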
The example above is relatively simple. In reality, complex problems are ...