Submitting
TORQUE/MAUI
TORQUE stands for Terascale Open-Source Resource and QUEue Manager. It is an open-source distributed resource manager originally based on OpenPBS, the Portable Batch System (PBS). Our installation is used for running parallel jobs and for making use of dedicated reservations. We use a separate program called Maui for scheduling jobs in TORQUE, but users do not interact with it directly, so we will not mention it further. If you have been given an account on the cluster, then you probably need PBS to run your jobs.
Use of the cluster through PBS is governed by a policy enforced by PBS and Maui. Currently, jobs are not limited in the number of nodes they can use. However, there is a fixed limit on the length of a job, and no user may have more than a certain number of jobs running at once. At the time of this writing, any job is limited to 96 hours, and any user is limited to ten running jobs. The time limit is "wall clock" time: the amount of real time elapsed, irrespective of how it is used. This contrasts with CPU time, which accumulates only while the job is actually running on a processor; for example, a job that sleeps for an hour uses an hour of wall clock time but almost no CPU time. We do not use CPU time for policy enforcement.
Please make sure /opt/UMtorque/bin comes before /usr/local/bin in your PATH environment variable. We will make this the default after we upgrade the whole cluster to TORQUE.
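For example, users of a Bourne-style shell could prepend the directory in a shell startup file (a minimal sketch; adjust for your own shell and startup file):
# In ~/.bashrc (or your shell's startup file):
# put the TORQUE binaries ahead of /usr/local/bin
PATH=/opt/UMtorque/bin:$PATH
export PATH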
To see the current policy on the cluster,
you can use the qmgr(8)
command:
[xhe@brood00 ~]$ qmgr -c " p s"
#
# Create queues and set their attributes.
#
#
# Create and define queue dque
#
create queue dque
set queue dque queue_type = Execution
set queue dque resources_max.cput = 04:00:00
set queue dque resources_max.walltime = 02:00:00
set queue dque resources_min.cput = 00:00:01
set queue dque resources_default.cput = 04:00:00
set queue dque resources_default.nodes = 1:ppn=1
set queue dque resources_default.walltime = 01:00:00
set queue dque max_user_run = 2
set queue dque enabled = True
set queue dque started = True
#
# Create and define queue long
#
create queue long
set queue long queue_type = Execution
set queue long acl_user_enable = True
set queue long resources_max.cput = 192:00:00
set queue long resources_max.walltime = 96:00:00
set queue long resources_min.cput = 00:00:01
set queue long resources_default.cput = 192:00:00
set queue long resources_default.nodes = 1:ppn=1
set queue long enabled = True
set queue long started = True
#
# Set server attributes.
#
set server scheduling = True
set server managers = root@queen.umiacs.umd.edu
set server operators = root@queen.umiacs.umd.edu
set server default_queue = dque
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 600
set server node_check_rate = 600
set server tcp_timeout = 6
set server pbs_version = 2.1.8
[xhe@brood00 ~]$
This invokes the queue management interface for PBS. Ordinary users cannot manipulate the queue from here, but they can inspect it. Here we print out the full server configuration, which includes the dque queue. The dque queue is the default -- there are other queues, but their use is out of the scope of this document. The resources_max.walltime value tells us the current maximum walltime for a job, and the max_user_run attribute tells us the maximum number of jobs that will run for any user at any time.
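If you only care about one queue, qmgr can also list a single queue's attributes directly (the list subcommand is standard qmgr syntax; output omitted here):
qmgr -c "list queue dque"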
Aside from qmgr, which you would
only use for inspecting the current policy,
there are several commands that you will use
for submitting, inspecting, and controlling
jobs. The following is by no means a
complete reference. Unfortunately, there is
not a lot of documentation available online.
You should look at the man pages if you have
further questions.
qstat
The qstat(1B) command is used
for querying the status of the queue, as
well as the status of individual jobs.
For the most part, you will invoke the qstat command without arguments to examine the state of the entire queue. However, you can specify one or more jobs on the command line to pick them out in particular, or give additional flags such as -n or -f to get allocated-node information or full job information, respectively. The curious should consult the man page for more information.
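For instance, assuming the job 11216.queen used in the examples below, either of the following would narrow the output to that job, showing its allocated nodes or its full attributes (output omitted here):
qstat -n 11216
qstat -f 11216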
Here are some examples of the use and output of qstat. Assume that I have already submitted a job, identified by 11216.queen, and it has not run yet:
[bargle@brood01 factor]$ qstat
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
11216.queen STDIN bargle 0 Q dque
The output of this command can be
interpreted as follows:
- Job id is the PBS identifier for the
job. This is unique in the queue. In
this case,
11216.queen
indicates that my job is the 11216th
job submitted to queen, the host where the PBS service runs.
- Name is the name of the script that
was submitted. This is not unique.
In this case,
STDIN indicates
that I piped the script directly to
the submission program instead of
using a persistent script on disk.
This is a useful but rarely used
technique.
- User is the UNIX username of the
user who submitted the job.
User
bargle is my
username.
- Time Use is the amount of CPU time
accumulated by the job. No time has
been used by this job, because it is
still queued.
- "S" is the current state
of the job. "Q" indicates
that the job is queued. State
"R" indicates that the job is
running.
- Queue is the name of the queue where
the job has been submitted. This will
almost always be dque.
Now, the job has been scheduled to run,
but the PBS service has not accounted
any CPU time use for the job yet:
[bargle@brood01 factor]$ qstat
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
11216.queen STDIN bargle 0 R dque
Here the job has started to accumulate
CPU time:
[bargle@brood01 factor]$ qstat
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
11216.queen STDIN bargle 00:00:13 R dque
Finally, after the job has finished
executing (note that there is no output,
since the queue is empty):
[bargle@brood01 factor]$ qstat
[bargle@brood01 factor]$
In the directory that was current when the job was submitted, PBS also leaves the output the job wrote to stdout and stderr. The files are called STDIN.o11216 and STDIN.e11216, respectively. We will go over the output of PBS in more detail later.
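In the meantime, the standard output of the job above could be viewed with an ordinary command:
cat STDIN.o11216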
qsub
The qsub(1B) program is used for submitting jobs to PBS. It has two primary modes of use: interactive jobs and batch jobs. Interactive jobs are useful for testing your programs, but not very useful for running many jobs, since they require your input. We will look at interactive jobs first. The following command asks for two nodes and sixty seconds (-l nodes=2,walltime=60) in interactive mode (-I). Here, after I get my allocation, I look at the contents of $PBS_NODEFILE (which lists the nodes I have been allocated) and exit:
[bargle@brood01 factor]$ qsub -l nodes=2,walltime=60 -I
qsub: waiting for job 11212.queen.umiacs.umd.edu to start
qsub: job 11212.queen.umiacs.umd.edu ready
[bargle@bug60 ~]$ cat $PBS_NODEFILE
bug60
bug59
[bargle@bug60 ~]$ exit
logout
qsub: job 11212.queen.umiacs.umd.edu completed
[bargle@brood01 factor]$
Next, we submit a job from a script that uses the pbsdsh program to run a process on all allocated nodes. The script, called helloworld.qsub, is as follows:
#!/bin/bash
# Set up the path
PATH=/usr/local/bin:$PATH
export PATH
# Make all hosts print out "Hello World"
pbsdsh echo Hello World
To submit the job:
[bargle@brood01 examples]$ qsub -l nodes=4 helloworld.qsub
11220.queen.umiacs.umd.edu
[bargle@brood01 examples]$
When a job finishes, PBS drops two
output files in the directory that was
current when the job was submitted.
These files are named for the script and
the job number. In this case, the files
are called
helloworld.qsub.o11220 and
helloworld.qsub.e11220 for the
standard output and standard error,
respectively. The error file is empty,
but here is the result of the output:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
Hello World
Hello World
Hello World
Hello World
The warning in the first two lines of the output is innocuous, and appears in every output file from PBS. The next four lines are "Hello World" printed from each of the four nodes where the job was scheduled, courtesy of the pbsdsh command. There are more examples in the next section.
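One more note on qsub: resource requests can also be embedded at the top of the script itself in #PBS directive lines, so they need not be repeated on the command line. A minimal sketch of the same script with its request built in (the walltime value is just an example):
#!/bin/bash
#PBS -l nodes=4,walltime=00:10:00

# Set up the path
PATH=/usr/local/bin:$PATH
export PATH

# Make all hosts print out "Hello World"
pbsdsh echo Hello World
With the directives in place, a plain "qsub helloworld.qsub" requests the same resources that the -l flag did above.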
qdel
The qdel(1B) program is used
for deleting jobs from the queue when
they are in the queued state. For
example:
[bargle@brood01 examples]$ qstat 11222.queen.umiacs.umd.edu
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
11222.queen STDIN bargle 0 Q dque
[bargle@brood01 examples]$ qdel 11222
[bargle@brood01 examples]$ qstat
[bargle@brood01 examples]$
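qdel also accepts the full job identifier, or several jobs at once (a small sketch; these job numbers are hypothetical):
qdel 11222.queen.umiacs.umd.edu
qdel 11223 11224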
qsig
The qsig(1B) program can be
used to send UNIX signals to running
jobs. For instance, it can be used to
kill running jobs:
[bargle@brood01 examples]$ qstat
Job id Name User Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
11221.queen STDIN bargle 00:00:01 R dque
[bargle@brood01 examples]$ qsig -s TERM 11221
[bargle@brood01 examples]$ qstat
[bargle@brood01 examples]$
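The signal may be given by name without the SIG prefix, as above, or by number; for example, either of the following would deliver SIGKILL to the same job:
qsig -s KILL 11221
qsig -s 9 11221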
pbsnodes
The pbsnodes(1B) program inspects the state of the nodes. It can list just the offline nodes, or examine all of them. To list all offline nodes:
[bargle@brood01 examples]$ pbsnodes -l
bug63 offline
[bargle@brood01 examples]$
To examine all nodes:
[bargle@brood01 examples]$ pbsnodes -a
bug00
state = free
np = 2
ntype = cluster
bug01
state = free
np = 2
ntype = cluster
... deleted ...
bug62
state = free
np = 2
ntype = cluster
bug63
state = offline
np = 2
ntype = cluster
[bargle@brood01 examples]$
Condor
Condor is used for high-throughput computing. It does not deal well with jobs that require parallel access to more than one machine, so it is generally used only for serial jobs. Among other things, Condor supports I/O redirection and automatic checkpointing, which add a level of fault tolerance and allow jobs to be pre-empted and migrate from machine to machine. Condor jobs will be pre-empted by jobs scheduled through PBS, or if a job runs too long while others are waiting. We have local documentation and examples, both introductory and for running Matlab code under Condor. There is extensive documentation available online.
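To give a flavor of how jobs are described, here is a minimal sketch of a Condor submit description file; the file name and paths are hypothetical, and note that the vanilla universe shown here does not provide checkpointing (the standard universe does):
# hello.submit -- a hypothetical submit description file
universe   = vanilla
executable = /bin/echo
arguments  = Hello World
output     = hello.out
error      = hello.err
log        = hello.log
queue
It would be submitted with condor_submit hello.submit; see the local documentation for details.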