Introduction to Slurm
The Slurm Workload Manager (formerly known as Simple Linux Utility for Resource Management, or SLURM) is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world’s supercomputers and computer clusters. It provides three key functions:
- First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.
- Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes.
- Finally, it arbitrates contention for resources by managing a queue of pending jobs.
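To make the three functions concrete, here is a toy model in Python. It is only an illustrative sketch, not how Slurm is implemented: the names `Job` and `ToyScheduler` are hypothetical, nodes are reduced to a simple counter, and jobs start in strict first-in, first-out order, which is much simpler than the scheduling policies a real workload manager applies.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes_needed: int


class ToyScheduler:
    """Toy FIFO model of allocation, launch, and queueing (not Slurm's algorithm)."""

    def __init__(self, total_nodes: int) -> None:
        self.free_nodes = total_nodes
        self.pending = deque()  # jobs waiting for resources
        self.running = {}       # job name -> number of nodes it holds

    def submit(self, job: Job) -> None:
        # Every job enters the queue first, then starts if resources allow.
        self.pending.append(job)
        self._start_pending()

    def finish(self, name: str) -> None:
        # Release the nodes held by a finished job and try the queue again.
        self.free_nodes += self.running.pop(name)
        self._start_pending()

    def _start_pending(self) -> None:
        # Strict FIFO: stop at the first queued job that does not fit.
        while self.pending and self.pending[0].nodes_needed <= self.free_nodes:
            job = self.pending.popleft()
            self.free_nodes -= job.nodes_needed
            self.running[job.name] = job.nodes_needed


sched = ToyScheduler(total_nodes=4)
sched.submit(Job("mpi_sim", 3))   # fits: 3 of 4 nodes allocated
sched.submit(Job("postproc", 2))  # only 1 node free, so it waits
print(sorted(sched.running), [j.name for j in sched.pending])
sched.finish("mpi_sim")           # nodes released, queued job starts
print(sorted(sched.running), [j.name for j in sched.pending])
```

The first `print` shows `mpi_sim` running while `postproc` waits in the queue; after `mpi_sim` finishes, its nodes are released and `postproc` starts.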
Slurm is the workload manager on about 60% of the TOP500 supercomputers, including Tianhe-2, which was the world’s fastest computer until 2016.
History
Slurm began development in the early 2000s as a free-software resource manager, in a collaborative effort primarily by Lawrence Livermore National Laboratory, SchedMD, Linux NetworX, Hewlett-Packard, and Groupe Bull. It was inspired by the closed-source Quadrics RMS and shares a similar syntax. The name is a reference to the soda in Futurama!
Components of the Slurm workload manager