Operating Systems: Virtualization, Concurrency & Persistence/

...

The Producer/Consumer (Bounded Buffer) Problem

This lesson introduces the famous producer/consumer problem and presents an attempt to solve it.

We'll cover the following...

- The problem
- A broken solution
- - Explanation

The next synchronization problem we will confront in this chapter is known as the producer/consumer problem, or sometimes as the bounded buffer problem, which was first posed by Dijkstra“Information Streams Sharing a Finite Buffer” by E.W. Dijkstra. Information Processing Letters 1: 179180, 1972. Available: http://www.cs.utexas.edu/users/EWD/ewd03xx/EWD329.PDF The famous paper that introduced the producer/consumer problem.. Indeed, it was this very producer/consumer problem that led Dijkstra and his co-workers to invent the generalized semaphore, which can be used as either a lock or a condition variable“My recollections of operating system design” by E.W. Dijkstra. April, 2001. Avail- able: http://www.cs.utexas.edu/users/EWD/ewd13xx/EWD1303.PDF. A fascinating read for those of you interested in how the pioneers of our field came up with some very basic and fundamental concepts, including ideas like “interrupts” and even “a stack”!. We will learn more about semaphores later.

The problem

Imagine one or more producer threads and one or more consumer threads. Producers generate data items and place them in a buffer; consumers grab said items from the buffer and consume them in some way.

This arrangement occurs in many real systems. For example, in a multi-threaded web server, a producer puts HTTP requests into a work queue (i.e., the bounded buffer). While, the consumer threads take requests out of this queue and process them.

A bounded buffer is also used when you pipe the output of one program into another, e.g., grep foo file.txt | wc -l. This example runs two processes concurrently; grep writes lines from file.txt with the string foo in them to what it thinks is standard output; the UNIX shell redirects the output to what is called a UNIX pipe (created by the pipe system call). The other end of this pipe is connected to the standard input of the process wc, which simply counts the number of lines in the input stream and prints out the result. Thus, the grep process is the producer; the wc process is the consumer; between them is an in-kernel bounded buffer; you, in this example, are just the happy user.

Because the bounded buffer is a shared resource, we must, of course, require synchronized access to it, lestThis is where we drop some serious Old English on you, and the subjunctive form. a race condition arise. To begin to understand this problem better, let’s examine some actual code.

The first thing needed is a shared buffer, into which a producer puts data, and out of which a consumer takes data. Let’s just use a single integer for simplicity (you can certainly imagine placing a pointer to a data structure into this slot instead), and the two inner routines to put a value into the shared buffer, and to get a value out of the buffer. See the code excerpt below for details.

Press + to interact

Pretty simple, no? The put() routine assumes the buffer is empty (and checks this with an assertion), and then simply puts a value into the shared buffer and marks it full by setting count to 1. The get() routine does the opposite, setting the buffer to empty (i.e., setting count to 0) and returning the value. Don’t worry that this shared buffer has just a single entry; later, you’ll generalize it to a queue that can hold multiple entries, which will be even more fun than it sounds.

Now you need to write some routines that know when it is OK to access the buffer to either put data into it or get data out of it. The conditions for this should be obvious: only put data into the buffer when count is zero (i.e., when the buffer is empty), and only get data from the buffer when count is one (i.e., when the buffer is full). If you write the synchronization code such that a producer puts data into a full buffer, or a consumer gets data from an empty one, you have done something wrong (and in this code, an assertion will fire).

This work is going to be done by two types of threads, one set of which is called the producer threads, and the other set, consumer threads. The code excerpt below shows the code for a producer that puts an integer into the shared buffer loops number of times, and a consumer that gets the data out of that shared buffer (forever), each time printing out the data ...