Parallelism
Get familiar with the basic concepts of parallelism.
What is parallelism?
Most modern microprocessors consist of more than one core, each of which can operate as an individual processing unit. They can execute different parts of different programs at the same time. The features of the std.parallelism
module make it possible for programs to take advantage of all of the cores in order to run faster.
This chapter covers the following range algorithms. These algorithms should be used only when the operations that are to be executed in parallel are truly independent from each other. “In parallel” means that operations are executed on multiple cores at the same time:
- parallel: Accesses the elements of a range in parallel
- task: Creates tasks that are executed in parallel
- asyncBuf: Iterates the elements of an InputRange semi-eagerly in parallel
- map: Calls functions with the elements of an InputRange semi-eagerly in parallel
- amap: Calls functions with the elements of a RandomAccessRange fully-eagerly in parallel
- reduce: Makes calculations over the elements of a RandomAccessRange in parallel
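As a first taste of how these algorithms fit into ordinary code, the following sketch uses parallel inside a regular foreach loop (the array of numbers and the multiplication are arbitrary placeholders):

```d
import std.parallelism;
import std.stdio;

void main() {
    auto numbers = [10, 20, 30, 40];

    // Same syntax as a regular foreach; parallel() distributes
    // the iterations across the available cores. The elements
    // keep their positions in the array.
    foreach (ref number; parallel(numbers)) {
        number *= 10;
    }

    writeln(numbers);  // prints [100, 200, 300, 400]
}
```

Because each iteration touches only its own element, the iterations are truly independent and safe to run in parallel.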
In the programs we have written so far, we have been assuming that the expressions of a program are executed in a certain order, in general line by line:
++i;
++j;
In the code above, we expect that the value of i
is incremented before the value of j
is incremented. Although that is semantically correct, it is rarely the case in reality. Usually, microprocessors and compilers use optimization techniques to allow some variables to reside in microprocessor registers that are independent from each other. When that is the case, the microprocessor would execute operations, like the increments above, in parallel.
Although these optimizations are effective, they cannot be automatically applied to layers higher than the very low-level operations. Only the programmer can determine that certain high-level operations are independent and that they can be executed in parallel.
In a loop, the elements of a range are normally processed one after the other, with operations of each element following the operations of previous elements:
auto students =
[ Student(1), Student(2), Student(3), Student(4) ];
foreach (student; students) {
student.aSlowOperation();
}
Normally, a program would be executed on one of the cores of the microprocessor, which has been assigned by the operating system to execute the program. As the foreach
loop normally operates on elements one after the other, aSlowOperation()
would be called for each student sequentially. However, in many cases it is not necessary for the operations of preceding students to be completed before beginning the operations of successive students. If the operations on the Student
objects were truly independent, it would be wasteful to ignore the other microprocessor cores, which might potentially be waiting idly on the system.
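When the operations really are independent, the loop above can be handed to parallel with a one-word change. A minimal sketch, with Student and aSlowOperation() stubbed out as assumptions (here the operation merely records that it ran):

```d
import std.parallelism;

struct Student {
    int number;
    bool done;

    void aSlowOperation() {
        // A stand-in for real, long-lasting work on this student.
        done = true;
    }
}

void main() {
    auto students = [Student(1), Student(2), Student(3), Student(4)];

    // The calls may now run on different cores at the same time.
    foreach (ref student; parallel(students)) {
        student.aSlowOperation();
    }

    // Every student has been processed, regardless of the order
    // in which the cores picked them up.
    foreach (student; students) {
        assert(student.done);
    }
}
```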
To simulate long-lasting operations, the following examples call Thread.sleep()
from the core.thread
module. Thread.sleep()
suspends the operations for the specified amount of time. Thread.sleep
is admittedly an artificial method in the following examples because it takes time without ever busying any core. Despite being an unrealistic tool, it is still useful in this chapter to demonstrate the power of parallelism.