PBS is a batch queuing system used to manage job submission on centralized computing resources, such as clusters and supercomputers. There are open source versions and commercial versions, the latter with enhancements for managing large systems that combine many compute resources and many users. The PBS main web site covers both the open source and commercial versions of PBS. The open source TORQUE Resource Manager implementation of PBS is the version best suited to our in-house lab needs.
Currently we are using PBS only on systems managed by other units, such as Axiom and Flux. However, we may install it at some point on our own equipment for small scale work, such as small LHS runs. This would make running those types of jobs easier, since we are now performing CPU "scheduling" by hand: we open several terminal windows on one or more systems and run a script in each window that works through a subset of the full set of runs. We've been informed that installing and configuring an open source version of PBS should take about an afternoon.
sudo apt-get update
sudo apt-get -y install tool-package-name
PBS has command line commands for submitting jobs (qsub), getting job status (qstat) and deleting jobs (qdel), among others. To submit a job you first create a PBS script and then submit that script. A PBS script performs any setup necessary, such as copying input files from the head node to scratch disk space on a compute node, then runs a program, and finally copies result files from scratch disk space on the compute node back to permanent disk space on the head node. There is a nice overview of PBS at the CCMB web site.
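The stage-in, run, stage-out pattern described above can be sketched as follows. This is a minimal illustration for a TORQUE-style PBS, not a script taken from any of our systems: the file name job.pbs, the job name example_run, the program my_model, the files input.dat and output.dat, and the use of /tmp as scratch space are all assumptions made for the example. PBS_O_WORKDIR (the directory qsub was run from) and PBS_JOBID are environment variables that PBS sets for the job.

```shell
# Write a small PBS job script to job.pbs (hypothetical file name).
# The quoted 'EOF' keeps the variables unexpanded until the job runs.
cat > job.pbs <<'EOF'
#!/bin/bash
#PBS -N example_run
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00
#PBS -j oe

# Stop at the first failing command.
set -e

# Assumed scratch location on the compute node; real clusters often
# provide a dedicated scratch file system instead of /tmp.
SCRATCH="/tmp/$PBS_JOBID"
mkdir -p "$SCRATCH"

# Stage in: copy the input from the submission directory to scratch.
cp "$PBS_O_WORKDIR/input.dat" "$SCRATCH/"

# Run the (hypothetical) program inside the scratch directory.
cd "$SCRATCH"
"$PBS_O_WORKDIR/my_model" input.dat > output.dat

# Stage out: copy results back to permanent disk space, then clean up.
cp output.dat "$PBS_O_WORKDIR/"
cd "$PBS_O_WORKDIR"
rm -rf "$SCRATCH"
EOF

# On a system with PBS installed, the job would then be managed with:
#   qsub job.pbs      # submit; prints the job identifier
#   qstat             # show the status of queued and running jobs
#   qdel <job id>     # delete the job, using the identifier from qsub
```

The resource requests (one node, one processor, ten minutes of wall time) are placeholders; each cluster documents its own limits and queue names, so those lines would need adjusting before real use.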