PBS

Job submission

To submit a job, use the qsub command:

qsub myjob.sh

OPTIONS

  • -q queueName : choose the queue you want [mandatory]
  • -N jobName : name of the job
  • -I : (capital i) run the job in interactive mode
  • -M mailaddress : (capital m) set the mail address (use in conjunction with -m); it MUST be @ulb.ac.be or @vub.ac.be
  • -m [a|b|e] : send mail on job status change (a = aborted, b = begin, e = end)
  • -l : resource options
For instance, to request 2 cores on one node: -l nodes=1:ppn=2
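
The options above can also be written inside the job script itself, on lines starting with #PBS, so that the qsub command line stays short. The following is a minimal sketch: the queue name myqueue, the job name myjob and the mail address are placeholders, not values taken from this cluster.

```shell
# Write a minimal job script. Lines starting with "#PBS" are read by qsub
# as if the same options had been given on the command line.
# "myqueue", "myjob" and the mail address are placeholders.
cat > myjob.sh <<'EOF'
#!/bin/bash
#PBS -q myqueue
#PBS -N myjob
#PBS -l nodes=1:ppn=2
#PBS -M first.last@ulb.ac.be
#PBS -m ae

# The actual work of the job goes below.
echo "Running on $(hostname)"
EOF

# The script is only written here, not submitted.
echo "Submit with: qsub myjob.sh"
```

Options given on the qsub command line take precedence over the #PBS lines, so the same script can still be sent to another queue with -q.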


If you want to send more than 2500 jobs to the cluster, write all qsub commands in a text file, and use the script big-submission (more info here).
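
The text file of qsub commands can be generated with a short loop. This is only a sketch: the file name submit_list.txt, the queue myqueue and the job scripts job_<n>.sh are hypothetical, and how the file is then passed to big-submission is described on the linked page, not here.

```shell
# Build a text file with one qsub command per line.
# "submit_list.txt", "myqueue" and "job_<n>.sh" are placeholder names.
n_jobs=3000
: > submit_list.txt          # create or empty the list file
i=1
while [ "$i" -le "$n_jobs" ]; do
    echo "qsub -q myqueue -N job_$i job_$i.sh" >> submit_list.txt
    i=$((i + 1))
done

# Show how many commands were written.
wc -l < submit_list.txt
```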


If you use MadGraph, read this section first or you risk crashing the cluster.


If you want to use the GPUs, please read here.

Job management

To see all jobs (running / queued), you can use the qstat command, or go to the JobView page to have a summary of what's running.

qstat

OPTIONS

  • -u username : list only jobs submitted by username
  • -n : show nodes where jobs are running
  • -q : show how jobs are distributed across the queues
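
The options can be combined. As a small sketch, the helper below (show_my_jobs is a made-up name) lists your own jobs together with the nodes they run on, and only calls qstat where it is installed:

```shell
# List your own jobs together with the nodes they run on.
# "show_my_jobs" is a hypothetical helper name.
show_my_jobs() {
    qstat -n -u "${USER:-$(id -un)}"
}

# Only call it where qstat exists (i.e. on the cluster).
if command -v qstat >/dev/null 2>&1; then
    show_my_jobs
else
    echo "qstat is not available on this machine"
fi
```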


Job Statistics

All the log files from the batch system are synced every 30 minutes to:

/group/log/torque/

A simple script that analyzes these logs and prints per-user statistics is available:

torque-user-info.py

Run it as is; it is in your $PATH, so it can be executed from any directory. It prints information like the following:

ID: 6077555  ExCode:   0 Mem:    0M cpuT:      0s wallT:      3s eff:  0.0%   STDIN
ID: 6077602  ExCode:   0 Mem:   50M cpuT:      0s wallT:      2s eff:  0.0%   STDIN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
   user[G]	# Jobs	 <MEM> +- RMS        #HiMem  MAX Mem      <CPU time>    <walltime>    <Eff>   % WT/WT_TOT    # Jobs with Error code (% of user job)
----------------------------------------------------------------------------------------------------------------------------------------------------------------
    rougny[l]	    12	    13 +- 22    MB      0      52 MB  |  00:00:00:00  00:00:00:24  ( 0.0%) (-1.00% of tot) | # EC: 
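
To pick out failed jobs from this output, the per-job lines can be filtered on the ExCode field. The sketch below assumes the per-job line layout shown above ("ID: <id>  ExCode: <code> ..."); failed_jobs is a made-up helper name, and the second sample line is invented for illustration.

```shell
# Print the IDs of jobs whose exit code is not 0.
# Reads torque-user-info.py output on stdin and assumes the per-job
# line layout shown above: "ID: <id>  ExCode: <code> ...".
failed_jobs() {
    awk '$1 == "ID:" && $3 == "ExCode:" && $4 != 0 { print $2 }'
}

# Demo on two sample lines (the second one is invented for illustration).
printf '%s\n' \
    'ID: 6077555  ExCode:   0 Mem:    0M cpuT:      0s wallT:      3s eff:  0.0%   STDIN' \
    'ID: 6077999  ExCode:   1 Mem:   10M cpuT:      1s wallT:      4s eff: 25.0%   STDIN' \
    | failed_jobs
```

On the cluster, the same filter would be used as: torque-user-info.py | failed_jobs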


If you want to test the batch system, you can follow the workbook here.

Job Deletion

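To delete a single job, use the following command:

qdel <JOBID>

To delete all your jobs, use the following line and be patient: each qdel call is cut off by timeout after 3 seconds, so the loop can take a while when many jobs are queued.

for j in $(qselect -u "$USER"); do timeout 3 qdel -a "$j"; done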