In typical programs, data processing and computation are performed sequentially. For large datasets or compute-intensive workloads, however, sequential execution can be slow and may require substantial memory and time.
Parallel computing processes different chunks of data simultaneously on different resources. These resources may be hardware accelerators such as GPUs and TPUs, or a cluster of CPUs.
Multi-threading in Julia enables parallel computation by splitting work into tasks that are assigned to different threads, which run simultaneously on separate CPU cores.
Multi-threading can be performed using the following macros:
@threads
: This macro splits the iterations of a for loop across all available threads.
@spawn
: This macro creates a task and schedules it to run on any available thread.
To understand multi-threading better, consider the following example, which uses Julia 1.8.1.
using Base.Threads

function factorial(n)
    if (n==1)
        return n
    else
        return n*factorial(n-1)
    end
end

# Find the number of threads
num_threads = nthreads()
println("Number of threads: $num_threads")

#Threaded loop
println("------THREADING------")
a = zeros(10)
@time begin
    Threads.@threads for i = 1:10
        a[i] = Threads.threadid()
    end
end
println("$a")

# Spawning
println("\n-------SPAWN-------")
@time begin
    f1 = Threads.@spawn factorial(7)
end
println(f1.result)
In this code example:
Lines 12–13: We print the number of threads. The default is a single thread; here, three threads are assigned by running the main.jl file with the julia --threads 3 main.jl command.
Lines 15–23: We perform multi-threading using the @threads macro:
Line 17: We declare an array of ten zeros.
Lines 18 and 22: We record the execution time using an @time begin ... end block.
Line 19: We execute the multi-threaded for loop using the Threads.@threads macro.
Line 20: We obtain the ID of the thread executing the current iteration using the Threads.threadid() function and store it at the corresponding array index.
Line 23: We print the final array. Notice that, in the output, the iterations are divided roughly evenly among the three threads.
Lines 26–30: We perform multi-threading using the @spawn macro:
Lines 27 and 29: We record the execution time using an @time begin ... end block.
Line 28: We evaluate the factorial() function using the Threads.@spawn macro.
Line 30: We read the result of the spawned task through its result field. Because this does not wait for the task to finish, fetch(f1) is the safer, idiomatic way to retrieve the value (see the short sketch after this walkthrough).
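The @spawn macro returns a Task object, and fetch blocks until that task completes before returning its value. Below is a minimal sketch of this pattern; the task name t and the summed range are illustrative, not part of the example above.

using Base.Threads

# Spawn a task on any available thread.
t = Threads.@spawn sum(1:1_000_000)

# fetch blocks until the task completes and returns its result;
# wait(t) would block without returning the value.
println(fetch(t))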
Note: This example is only a simple demonstration of multi-threading in Julia. In real applications, threads that share data must be synchronized carefully to avoid race conditions and incorrect results.
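For instance, letting every thread update the same variable without synchronization produces a data race, and one common remedy is an atomic counter. The sketch below contrasts the two versions; the counter names and iteration count are illustrative.

using Base.Threads

# Racy version: all threads read-modify-write the same global,
# so increments can be lost and the final count is unpredictable.
unsafe_count = 0
Threads.@threads for i = 1:100_000
    global unsafe_count += 1
end

# Safe version: atomic_add! makes each increment indivisible.
safe_count = Threads.Atomic{Int}(0)
Threads.@threads for i = 1:100_000
    Threads.atomic_add!(safe_count, 1)
end

println("Racy total: $unsafe_count, atomic total: $(safe_count[])")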
Distributed computing employs multiple processes, possibly spread across a cluster of machines, for computation. Julia provides distributed computing through the Distributed package.
To understand distributed computing better, consider the following example, which uses Julia 1.8.1. The file is run using the julia -p 2 main.jl command, where the -p n flag launches n worker processes for distributed computing.
function factorial(n)
    if (n==1)
        return n
    else
        return n*factorial(n-1)
    end
end

@time begin
    f1 = factorial(7)
    f2 = factorial(8)
end
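Note that the code above defines the recursive factorial but still evaluates both calls on the main process. Below is a minimal sketch of how the same calls could be dispatched to the worker processes started by -p 2, using @everywhere, @spawnat, and fetch from the Distributed standard library; the function name myfactorial is illustrative.

using Distributed   # worker processes are created by the -p flag

# Make the function definition available on every worker process.
@everywhere function myfactorial(n)
    n == 1 ? n : n * myfactorial(n - 1)
end

@time begin
    # :any lets the scheduler pick an available worker for each call.
    f1 = @spawnat :any myfactorial(7)
    f2 = @spawnat :any myfactorial(8)
    println(fetch(f1), " ", fetch(f2))   # fetch waits for and returns each result
end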
Julia also provides GPU support for parallel computation. In addition to general GPU programming, Julia has dedicated numerical-computing libraries that leverage GPUs directly. NVIDIA GPUs have the most mature support, but packages for several other vendors' GPUs are also available. These GPUs differ in their performance characteristics and the kinds of workloads they support.
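For example, NVIDIA GPUs are programmed through the CUDA.jl package, whose array type lets ordinary broadcasts and reductions run on the device. Below is a minimal sketch, assuming CUDA.jl is installed and a CUDA-capable GPU is present; the array names are illustrative.

using CUDA

# Allocate data directly on the GPU.
x = CUDA.rand(Float32, 1_000_000)

# Broadcasts and reductions on GPU arrays are compiled into GPU kernels.
y = 2.0f0 .* x .+ 1.0f0
total = sum(y)

# Copy the result back to host memory when needed.
y_host = Array(y)
println(total)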