Feature #8: Distributed Process Coordinator

Implement the "Distributed Process Coordinator" feature for our "Search Engine" project.

Description

For this search engine feature, we will implement a distributed process coordinator. In distributed computing, a coordinator organizes a task that is distributed among nodes. Our distributed process coordinator is responsible for breaking a task into multiple subtasks, assigning those subtasks to different worker nodes, and monitoring their progress. We want the system to be fault tolerant, so that if one or more worker nodes fail, our search engine can continue working without interruption. To achieve this, we will implement snapshot functionality that saves the current progress of the worker nodes.

We have n worker nodes. Each node has a state, which is the number of subtasks that the node has successfully executed. Initially, the state of each node is 0. We can change the progress state of a node by using the setState(idx, state) function, which takes two parameters: idx is the index of the node whose progress we are setting, and state is the new state of that node. We should also be able to take a snapshot of the nodes at any time, that is, save the current state of every node at that moment. To implement this, we need to create a snap() function. This function takes no parameters and returns the snapId, which is the total number of times snap() has been called, minus 1. The first snapshot therefore has the snapId 0.

We should also be able to access the state of any node as of any snapshot by using the fetchState(idx, snapId) function. This function returns the state that node idx had when the snapshot snapId was taken.
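The three functions described above can be sketched as follows. This is only one possible approach, not necessarily the intended solution: it assumes each node keeps a short history of (snapshot id, state) entries, so that snap() does no copying and fetchState() uses a binary search. The class name DistributedProcessCoordinator and the internal attribute names are hypothetical; only setState, snap, and fetchState come from the description.

```python
from bisect import bisect_right


class DistributedProcessCoordinator:
    """Sketch only. The class name and attributes are assumptions;
    the method names follow the feature description."""

    def __init__(self, n):
        # One history per node, kept as two parallel lists:
        # the snapshot id an entry belongs to, and the state written.
        # Every node starts with state 0 under snapshot id 0.
        self.snap_ids = [[0] for _ in range(n)]
        self.states = [[0] for _ in range(n)]
        self.next_snap_id = 0  # id that the next snap() call will return

    def setState(self, idx, state):
        if self.snap_ids[idx][-1] == self.next_snap_id:
            # Node already has an entry for the upcoming snapshot: overwrite it.
            self.states[idx][-1] = state
        else:
            # First write since the last snap(): start a new entry.
            self.snap_ids[idx].append(self.next_snap_id)
            self.states[idx].append(state)

    def snap(self):
        # Nothing is copied; past entries are simply never modified again.
        self.next_snap_id += 1
        return self.next_snap_id - 1

    def fetchState(self, idx, snapId):
        # Find the last entry recorded at or before snapId.
        pos = bisect_right(self.snap_ids[idx], snapId) - 1
        return self.states[idx][pos]
```

Under these assumptions, setState and snap run in constant time, and fetchState takes time logarithmic in the number of entries recorded for that node. The sketch does not validate its arguments: passing a snapId that has not been taken yet returns the node's current state.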

Suppose that we have three nodes, as shown in the illustration below. Initially, the state of all the nodes is 0. After calling setState(1, 4), the state of node1 changes to 4. If we take a snapshot at this point, the current state of all nodes is saved under snapshot id 0. If we then call setState(1, 7), the current state of node1 changes to 7. Finally, if we call fetchState(1, 0), we get node1’s state from snapshot 0, which is 4.
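The walkthrough above can be replayed in a few lines of code. To keep the block self-contained, it uses a deliberately naive version that copies the state of every node on each snap(). The class name NaiveCoordinator and its attributes are hypothetical, and this is meant as a way to check the expected behavior rather than as an efficient implementation.

```python
class NaiveCoordinator:
    """Hypothetical helper: stores a full copy of all node states per snapshot."""

    def __init__(self, n):
        self.current = [0] * n   # live state of each node, initially 0
        self.snapshots = []      # snapshots[snapId] is a saved copy of all states

    def setState(self, idx, state):
        self.current[idx] = state

    def snap(self):
        self.snapshots.append(list(self.current))
        return len(self.snapshots) - 1  # first snapshot gets id 0

    def fetchState(self, idx, snapId):
        return self.snapshots[snapId][idx]


coordinator = NaiveCoordinator(3)           # three nodes, all in state 0
coordinator.setState(1, 4)                  # node1 -> 4
snap_id = coordinator.snap()                # saves [0, 4, 0] under id 0
coordinator.setState(1, 7)                  # node1 -> 7 (snapshot 0 is unaffected)
saved = coordinator.fetchState(1, snap_id)  # node1's state in snapshot 0
print(snap_id, saved)                       # prints: 0 4
```

Copying all n states on every snapshot is simple but costs time and memory proportional to n per snap() call, which becomes wasteful when only a few nodes change between snapshots.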
