Design of Distributed Task Scheduler

Explore and connect design components of the distributed task scheduler.

We identified requirements for a distributed task scheduler in the previous lesson. In this lesson, we will map those requirements to a design.

Components

We can consider scheduling at many levels. An organization might schedule its own tasks on its own cluster of machines, where it has to find sufficient resources and decide which task to run first. Alternatively, a cloud provider might schedule tasks coming from multiple clients. The cloud provider needs to decide which task to run first and which clients to handle first, while providing appropriate isolation between tenants. In general, the major components of the system are:

Clients that initiate task execution.
Resources on which the task actually executes.
A scheduler between clients and resources that decides which task should get the resource first.

As shown in the above illustration, incoming tasks must be placed in a queue, for the following reasons:

We might not have sufficient resources available right now.
Tasks can depend on one another, so some tasks need to wait for others to finish.
We need to decouple the clients from task execution so that they can hand off work to our system, which then queues it for execution.
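
The three reasons above can be illustrated with a small in-memory sketch. It is not the lesson's design, only a toy under stated assumptions: the names `Task`, `Scheduler`, `free_slots`, `submit`, `dispatch`, and `complete` are hypothetical, resources are modeled as a simple counter of free slots, and tasks are considered in first-in, first-out order.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    # Ids of tasks that must finish before this one may start.
    depends_on: set = field(default_factory=set)


class Scheduler:
    """Toy in-memory scheduler; every name here is hypothetical."""

    def __init__(self, free_slots: int):
        self.free_slots = free_slots  # stand-in for available resources
        self.queue = deque()          # tasks waiting for a slot or a dependency
        self.done = set()             # ids of completed tasks

    def submit(self, task: Task) -> None:
        # Clients hand off work and return immediately (decoupling).
        self.queue.append(task)

    def dispatch(self) -> list:
        """Start every queued task that is runnable right now, in FIFO order."""
        started, waiting = [], deque()
        while self.queue:
            task = self.queue.popleft()
            if self.free_slots > 0 and task.depends_on <= self.done:
                self.free_slots -= 1
                started.append(task.task_id)
            else:
                # No free resources, or a dependency has not finished yet.
                waiting.append(task)
        self.queue = waiting
        return started

    def complete(self, task_id: str) -> None:
        # A finished task frees its slot and may unblock dependents.
        self.done.add(task_id)
        self.free_slots += 1
```

With a single free slot, submitting `a`, then `b` (which depends on `a`), then `c` starts only `a` on the first `dispatch()`. The other two tasks stay queued until `complete("a")` frees the slot, which shows the queue absorbing both the resource shortage and the dependency wait.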

Let’s design a task scheduling system that can schedule any task. Many tasks are relatively short-lived (from seconds to minutes). For long-running tasks, we might need periodic checkpointing and restoring at the application level to recover from possible failures. We assume that each task’s computational needs can be met by a single server in our fleet. For tasks that need many servers, ...
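
For long-running tasks, saving progress periodically and restoring it after a failure (commonly called checkpointing) might look like the following sketch. It is an illustration under assumptions, not a prescribed mechanism: the function `run_with_checkpoints`, its parameters, and the JSON checkpoint file are all hypothetical, and the "work" is just summing a list.

```python
import json
import os


def run_with_checkpoints(items, checkpoint_path, every=2, fail_at=None):
    """Sum `items`, saving progress every `every` items.

    If a checkpoint file already exists, resume from it instead of
    starting over. `fail_at` simulates a crash at that index.
    """
    state = {"next_index": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # restore after a previous failure

    for i in range(state["next_index"], len(items)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated crash")
        state["total"] += items[i]
        state["next_index"] = i + 1
        if state["next_index"] % every == 0:
            # Write to a temporary file, then rename it, so a crash
            # never leaves a half-written checkpoint behind.
            tmp = checkpoint_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(state, f)
            os.replace(tmp, checkpoint_path)
    return state["total"]
```

If a run crashes partway through, calling the function again with the same `checkpoint_path` repeats only the work done since the last saved checkpoint. The interval `every` trades checkpoint overhead against the amount of work lost on failure.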
