Mastering Functional Programming with OCaml and Haskell/

...

Stream-based Dataflow Programming

Understand how streams allow us to formulate dataflow programs that work with signals of infinite elements.

We'll cover the following...

Limitations of list-based dataflow programming
Delay evaluation
Stream is a delayed list
Higher-order functions on streams
Stream-based dataflow programming
Infinite lists in non-strict languages

Limitations of list-based dataflow programming

The functional paradigm allows us to formulate programs in the dataflow style by composing function lists, such as map, filter, and fold. Unfortunately, list-based dataflow programming in OCaml has a severe limitation—lists are finite.

To illustrate, suppose we implement a function called first_prime_greater_equal. This function returns the first prime number greater than or equal to a given n. We can attempt to formulate a dataflow program to solve this problem with the following steps:

Enumerate integers greater than or equal n
Filter the list, keeping only prime numbers
Take the head of the resulting list

In other words, we attempt to formulate the following dataflow diagram:

To differentiate an infinite list from a finite one, we will write infinitely many elements in triangle brackets, e.g., <100; 101; 102; ...>.

Unfortunately, as elegant as the dataflow diagram is, we can not formulate it. The enumerate and filter components need to work with signals of infinitely many elements. However, OCaml lists can only hold a finite number of elements. To see why, let’s try to define a list of infinite natural numbers: <0; 1; 2; ...> with the following recursive function:

let rec naturals_from n = n :: naturals_from (n + 1)

We applythat function to 0 to obtain the sequence of naturals ( 0, 1, 2,...).

let naturals = naturals_from 0

Press + to interact

If we run the code above, we’ll see a stack overflow exception. This is because when we try to apply naturals_from 0, it calls naturals_from 1, which calls naturals_from 2, and so forth. This continues until the stack limit is reached. In other words, because OCaml is a strict language, the hd :: tl list constructor always evaluates the two arguments, hd and tl, regardless of whether their values are needed or not.

To circumvent the problem of infinite elements, we can define a first_prime_between function that returns the first prime number between a and b. Since the [a, b] range is finite, we can formulate a list-based dataflow program to solve this problem.

Press + to interact

OCaml

(** [enumerate_integers a b] returns a list of integers between [a] and [b]. *)
let rec enumerate_integers a b = if a > b then [] else a :: enumerate_integers (a + 1) b
(** [is_prime n] is [true] if [n] is a prime or [false] otherwise. *)
let rec is_prime n =
    if n < 2 then false else if n = 2 then true else
        let rec aux m = if n mod m = 0 then false
                        else if m * m > n then true else aux (m + 1) in aux 2
(** [first_prime_between a b] is the first prime number between [a] and [b]. *)
let first_prime_between a b = enumerate_integers a b |> List.filter is_prime |> List.hd
(** Printing the result of 'first_prime_between 100 1000'. *)
let _ = print_int (first_prime_between 100 1000)

Introduction

Expressions: Building Blocks of Functional Programs

Building Abstractions with Functions

Compound Data Types

Common Computation Patterns

Dataflow Programming with Functions

Applying Functional Programming to Various Domains

Conclusion

Appendix

Stream-based Dataflow Programming

Limitations of list-based dataflow programming