PBS TMPDIR

The PBS TMPDIR patch

When using shared NFS home directories on a large number of nodes, heavy load on the cluster overloads the central NFS server and degrades performance. Shared home directories are of course very easy to set up, and they are also necessary for MPI.

With thanks to David Groep:

Using the fact that TMPDIR points to a local (per-node) scratch directory,
we have been applying a patch to the original pbs job manager to select
the current working directory of the job, based on the number of nodes
requested by the job and the job type. The patch is reproduced below
(it's just a few lines).
What it does:
  • if the job is of type "mpi", or if the type is "multiple" and the
  number of requested nodes > 1, the behaviour of the pbs job manager
  is unaltered.
  • if the job type is "single", or the type is "multiple" and the
  job requests 0 or 1 nodes, the following statement is inserted
  in the PBS job script, just before the user job is started:
    [ x"$TMPDIR" != x"" ] && cd $TMPDIR
This patch is applied to the template for the pbs.pm job manager script
in /opt/globus/setup/globus/pbs.in, which gets translated on startup
into /opt/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm.
It has so far worked fine for all LCG jobs, which at NIKHEF also go
through the "old" pbs job manager. The jobs don't notice the difference,
and we can use shared home directories for all VOs, provided we also have
a per-node $TMPDIR location on local disk.
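
For illustration, here is a minimal sketch of how the tail of a generated PBS job script could look for a single-node job once the patch is in place (the user executable and its path are hypothetical):

  #!/bin/sh
  # ... #PBS directives and environment setup emitted by pbs.pm ...

  # statement inserted by the patched job manager for single-node jobs:
  [ x"$TMPDIR" != x"" ] && cd $TMPDIR

  # hypothetical user executable; it now starts in the per-node $TMPDIR
  # rather than in the shared NFS home directory
  /home/user001/bin/run_analysis < /dev/null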

The patch

  • on the nodes, make sure there is a sufficient amount of space in /var/spool/pbs/tmpdir (symlink it to /scratch or overmount it)
  • Check that you have a pbs-server version with the tmpdir patch (e.g., for LCG, torque-1.2.0p3-2 has it)
  • /opt/globus/setup/globus/pbs.in (the patch below might not apply cleanly; if it doesn't, make the changes manually, and don't forget to make a copy of the original first!):
 *** pbs.in.orig 2005-05-20 12:56:32.000000000 +0200
 --- pbs.in      2005-05-20 12:52:05.000000000 +0200
***************
 *** 321,327 ****
        }
        print JOB "wait\n";
      }
 !     elsif($description->jobtype() eq 'multiple')
      {
        my $count = $description->count;
        my $cmd_script_url ;
 --- 321,327 ----
        }
        print JOB "wait\n";
      }
 !     elsif( ($description->jobtype() eq 'multiple') and ($description->count > 1 ) )
      {
        my $count = $description->count;
        my $cmd_script_url ;
***************
 *** 374,379 ****
 --- 374,393 ----
      }
      else
      {
 +       # this is a simple single-node job that can use $TMPDIR
 +       # unless the user has given one explicitly
 +       # refer back to JobManager.pm, but currently it seems that
 +       # $self->make_scratchdir uses "gram_scratch_" as a component
 +         if ( ( $description->directory() =~ /.*gram_scratch_.*/ ) and
 +           ( $description->host_count() <= 1 ) and
 +           ( $description->count <= 1 )
 +         ) {
 +           print JOB '# user ended in a scratch directory, reset to TMPDIR'."\n";
 +           print JOB '[ x"$TMPDIR" != x"" ] && cd $TMPDIR'."\n";
 +         } else {
 +           print JOB '# user requested this specific directory'."\n";
 +         }
 +
        print JOB $description->executable(), " $args <",
            $description->stdin(), "\n";
      }
  • From the directory /opt/globus/setup/globus/, run ./setup-globus-job-manager-pbs (it will create a new pbs.pm)
  • next to the print JOB '[ x"$TMPDIR" != x"" ] && cd $TMPDIR'."\n"; statement, you might also have the job manager emit:
    • export PBS_O_WORKDIR=$TMPDIR
    • export SCRATCH_DIRECTORY=$TMPDIR
  • to test, submit a job with /bin/pwd
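
For example, assuming the Globus client tools are installed and a valid proxy can be created, a quick check could look as follows (the CE hostname ce.example.org is only a placeholder):

  # create a proxy and run /bin/pwd as a single-node job through the pbs job manager
  grid-proxy-init
  globus-job-run ce.example.org/jobmanager-pbs /bin/pwd
  # the printed directory should now be the per-node scratch area
  # (something like /var/spool/pbs/tmpdir/<jobid>), not the shared home directory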

