Introduction to PBS
What is PBS?
Portable Batch System (PBS) is the name of computer software that performs job scheduling. Its primary task is to allocate computational tasks, i.e., batch jobs, among the available computing resources. The following versions of PBS are currently available:
- OpenPBS
- TORQUE
- PBS Professional (PBS Pro)
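To make the idea of allocating batch jobs among computing resources concrete, here is a minimal toy sketch in Python. It is not PBS code and does not reflect PBS's actual scheduling policies: the `Node` and `Job` classes, the `schedule_first_fit` function, and the node and job names are all hypothetical, chosen only for illustration. Each job asks for a number of cores and is placed on the first node that has enough free cores. Jobs that do not fit stay in the queue.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A compute node with a number of currently free cores (hypothetical model)."""
    name: str
    free_cores: int


@dataclass
class Job:
    """A batch job that requests a fixed number of cores (hypothetical model)."""
    job_id: str
    cores: int


def schedule_first_fit(jobs, nodes):
    """Place jobs, in submission order, on the first node with enough free cores.

    Returns (placements, waiting): a dict mapping job id to node name, and a
    list of job ids that could not be placed and remain queued. A job that
    does not fit does not block later, smaller jobs from being placed.
    """
    placements = {}
    waiting = []
    for job in jobs:
        for node in nodes:
            if node.free_cores >= job.cores:
                node.free_cores -= job.cores  # reserve the cores on this node
                placements[job.job_id] = node.name
                break
        else:
            # No node had enough free cores for this job.
            waiting.append(job.job_id)
    return placements, waiting


nodes = [Node("node01", 8), Node("node02", 4)]
jobs = [Job("job.1", 6), Job("job.2", 4), Job("job.3", 4), Job("job.4", 2)]

placements, waiting = schedule_first_fit(jobs, nodes)
print("placed: ", placements)
print("waiting:", waiting)
```

In this run, `job.3` stays queued because no node has four free cores left, while the smaller `job.4` still fits on `node01`. A production scheduler such as PBS additionally weighs site policies, priorities, and queue limits, none of which this sketch models.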
A brief history of PBS
Before the emergence of clusters, the Unix-based Network Queuing System (NQS) from NASA Ames Research Center was a commonly used batch-queuing system. With the emergence of parallel distributed systems, NQS began to show its limitations. Consequently, Ames led an effort to develop requirements and specifications for a newer, cluster-compatible system. With NASA’s funding, this effort resulted in PBS in the early 1990s. In 2003, PBS was acquired by Altair Engineering and is now marketed as PBS Pro by Altair Grid Technologies, a subsidiary of Altair Engineering.
PBS Pro is now also an open source project and has become part of the OpenHPC software stack. It is available for download, along with its full source code, from pbspro.org.
PBS Pro’s key features
- Scalability: supports millions of cores with fast job dispatch and minimal latency; tested beyond 50,000 nodes
- Policy-Driven Scheduling: meets unique site goals and SLAs by balancing job turnaround time and utilization with optimal job placement
- Resiliency: includes automatic fail-over architecture with no single point of failure; jobs are never lost, and jobs continue to run despite failures
- Flexible Plugin Framework: simplifies administration with enhanced visibility and extensibility; customize implementations to meet complex requirements
- Health Checks: monitors and automatically mitigates faults with a comprehensive health check framework