<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://t2bwiki.iihe.ac.be/index.php?action=history&amp;feed=atom&amp;title=PBS_TMPDIR</id>
	<title>PBS TMPDIR - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://t2bwiki.iihe.ac.be/index.php?action=history&amp;feed=atom&amp;title=PBS_TMPDIR"/>
	<link rel="alternate" type="text/html" href="https://t2bwiki.iihe.ac.be/index.php?title=PBS_TMPDIR&amp;action=history"/>
	<updated>2026-04-20T09:48:15Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.5</generator>
	<entry>
		<id>https://t2bwiki.iihe.ac.be/index.php?title=PBS_TMPDIR&amp;diff=209&amp;oldid=prev</id>
		<title>Maintenance script: Created page with &quot; === The PBS TMPDIR patch === When using shared NFS homedirectories on a large number of nodes, heavy load on the cluster leads to overload on the central NFS-server and bad ...&quot;</title>
		<link rel="alternate" type="text/html" href="https://t2bwiki.iihe.ac.be/index.php?title=PBS_TMPDIR&amp;diff=209&amp;oldid=prev"/>
		<updated>2015-08-26T12:28:55Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot; === The PBS TMPDIR patch === When using shared NFS homedirectories on a large number of nodes, heavy load on the cluster leads to overload on the central NFS-server and bad ...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;
=== The PBS TMPDIR patch ===&lt;br /&gt;
When using shared NFS home directories on a large number of nodes, heavy load on the cluster overloads the central NFS server and degrades performance. Shared home directories are of course very easy to set up, and they are also necessary for MPI.&lt;br /&gt;
&lt;br /&gt;
With thanks to David Groep:&lt;br /&gt;
 Using the fact that TMPDIR points to a local (per-node) scratch directory,&lt;br /&gt;
 we have been applying a patch to the original pbs manager to select,&lt;br /&gt;
 based on the number of nodes requested by the job and the job type, the&lt;br /&gt;
 current working directory of the job. Patch is attached to this mail&lt;br /&gt;
 (it&amp;#039;s just a few lines).&lt;br /&gt;
 What it does:&lt;br /&gt;
*if the job is of type &amp;quot;mpi&amp;quot;, or if the type is &amp;quot;multiple&amp;quot; and the&lt;br /&gt;
   number of requested nodes &amp;gt; 1, the behaviour of the pbs job manager&lt;br /&gt;
   is unaltered.&lt;br /&gt;
*if the job type is &amp;quot;single&amp;quot;, or the type is &amp;quot;multiple&amp;quot; and the&lt;br /&gt;
   job requests 0 or 1 nodes, the following statement is inserted&lt;br /&gt;
   in the PBS job script, just before the user job is started:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
    [ x&amp;quot;$TMPDIR&amp;quot; != x&amp;quot;&amp;quot; ] &amp;amp;&amp;amp; cd $TMPDIR&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
 This patch is applied to the template for the pbs.pm job manager script&lt;br /&gt;
 in /opt/globus/setup/globus/pbs.in, which then gets translated on&lt;br /&gt;
 startup in /opt/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm &lt;br /&gt;
 It has so far worked fine for all LCG jobs that at NIKHEF also go&lt;br /&gt;
 through the &amp;quot;old&amp;quot; pbs JM. The jobs don&amp;#039;t notice the difference and&lt;br /&gt;
 we can use shared home dirs for all VOs, provided we also have a&lt;br /&gt;
 per-node $TMPDIR location on local disk.&lt;br /&gt;
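The inserted statement can be sketched as a small POSIX shell function (a minimal sketch; run_in_tmpdir is a hypothetical name, and the [ -n ... ] test replaces the patch's legacy x-prefix idiom):&lt;br /&gt;

```shell
#!/bin/sh
# Minimal sketch of the statement the patch inserts just before the user
# job starts: change into the per-node scratch directory only when PBS
# has set TMPDIR. run_in_tmpdir is a hypothetical wrapper name.
run_in_tmpdir() {
    if [ -n "$TMPDIR" ]; then
        cd "$TMPDIR" || return 1
    fi
    pwd
}

# Demonstration with a stand-in scratch directory (subshells keep the
# caller's working directory untouched):
TMPDIR=$(mktemp -d)
export TMPDIR
( run_in_tmpdir )    # prints the scratch directory
unset TMPDIR
( run_in_tmpdir )    # prints the unchanged working directory
```

Jobs without a TMPDIR simply keep running from their original working directory, which is why the patched job manager is safe to apply globally.&lt;br /&gt;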
==== The patch ====&lt;br /&gt;
*On the nodes, make sure there is sufficient space in /var/spool/pbs/tmpdir (symlink it to /scratch or overmount it)&lt;br /&gt;
*Check that you have a pbs-server version with the tmpdir patch (e.g. for LCG, torque-1.2.0p3-2 has it)&lt;br /&gt;
*Patch /opt/globus/setup/globus/pbs.in (the patch might not apply cleanly; in that case apply it manually, and don&amp;#039;t forget to make a copy of the original first!):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 *** pbs.in.orig 2005-05-20 12:56:32.000000000 +0200&lt;br /&gt;
 --- pbs.in      2005-05-20 12:52:05.000000000 +0200&lt;br /&gt;
***************&lt;br /&gt;
 *** 321,327 ****&lt;br /&gt;
        }&lt;br /&gt;
        print JOB &amp;quot;wait\n&amp;quot;;&lt;br /&gt;
      }&lt;br /&gt;
 !     elsif($description-&amp;gt;jobtype() eq &amp;#039;multiple&amp;#039;)&lt;br /&gt;
      {&lt;br /&gt;
        my $count = $description-&amp;gt;count;&lt;br /&gt;
        my $cmd_script_url ;&lt;br /&gt;
 --- 321,327 ----&lt;br /&gt;
        }&lt;br /&gt;
        print JOB &amp;quot;wait\n&amp;quot;;&lt;br /&gt;
      }&lt;br /&gt;
 !     elsif( ($description-&amp;gt;jobtype() eq &amp;#039;multiple&amp;#039;) and ($description-&amp;gt;count &amp;gt; 1 ) )&lt;br /&gt;
      {&lt;br /&gt;
        my $count = $description-&amp;gt;count;&lt;br /&gt;
        my $cmd_script_url ;&lt;br /&gt;
***************&lt;br /&gt;
 *** 374,379 ****&lt;br /&gt;
 --- 374,393 ----&lt;br /&gt;
      }&lt;br /&gt;
      else&lt;br /&gt;
      {&lt;br /&gt;
 +       # this is a simple single-node job that can use $TMPDIR&lt;br /&gt;
 +       # unless the user has given one explicitly&lt;br /&gt;
 +       # refer back to JobManager.pm, but currently it seems that&lt;br /&gt;
 +       # $self-&amp;gt;make_scratchdir uses &amp;quot;gram_scratch_&amp;quot; as a component&lt;br /&gt;
 +         if ( ( $description-&amp;gt;directory() =~ /.*gram_scratch_.*/ ) and&lt;br /&gt;
 +           ( $description-&amp;gt;host_count() &amp;lt;= 1 ) and&lt;br /&gt;
 +           ( $description-&amp;gt;count &amp;lt;= 1 )&lt;br /&gt;
 +         ) {&lt;br /&gt;
 +           print JOB &amp;#039;# user ended in a scratch directory, reset to TMPDIR&amp;#039;.&amp;quot;\n&amp;quot;;&lt;br /&gt;
 +           print JOB &amp;#039;[ x&amp;quot;$TMPDIR&amp;quot; != x&amp;quot;&amp;quot; ] &amp;amp;&amp;amp; cd $TMPDIR&amp;#039;.&amp;quot;\n&amp;quot;;&lt;br /&gt;
 +         } else {&lt;br /&gt;
 +           print JOB &amp;#039;# user requested this specific directory&amp;#039;.&amp;quot;\n&amp;quot;;&lt;br /&gt;
 +         }&lt;br /&gt;
 +&lt;br /&gt;
        print JOB $description-&amp;gt;executable(), &amp;quot; $args &amp;lt;&amp;quot;,&lt;br /&gt;
            $description-&amp;gt;stdin(), &amp;quot;\n&amp;quot;;&lt;br /&gt;
      }&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
*From the directory /opt/globus/setup/globus/, run ./setup-globus-job-manager-pbs (it will create a new pbs.pm)&lt;br /&gt;
*You might also add, after the &amp;lt;tt&amp;gt;print JOB &amp;#039;[ x&amp;quot;$TMPDIR&amp;quot; != x&amp;quot;&amp;quot; ] &amp;amp;&amp;amp; cd $TMPDIR&amp;#039;.&amp;quot;\n&amp;quot;;&amp;lt;/tt&amp;gt; statement:&lt;br /&gt;
**&amp;lt;tt&amp;gt;export PBS_O_WORKDIR=$TMPDIR&amp;lt;/tt&amp;gt;&lt;br /&gt;
**&amp;lt;tt&amp;gt;export SCRATCH_DIRECTORY=$TMPDIR&amp;lt;/tt&amp;gt;&lt;br /&gt;
*To test, submit a job that runs /bin/pwd and check that it reports the local scratch directory&lt;br /&gt;
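The node-side setup and the /bin/pwd check above can be sketched end to end (a hedged sketch: mktemp stand-ins replace the real /scratch and /var/spool/pbs paths, so it is runnable outside a worker node):&lt;br /&gt;

```shell
#!/bin/sh
# Sketch of the steps above, with mktemp stand-ins for the real paths.
SCRATCH_ROOT=$(mktemp -d)      # stands in for /scratch on a worker node
PBS_SPOOL=$(mktemp -d)         # stands in for /var/spool/pbs

# 1. Point the pbs tmpdir at local scratch (the symlink variant).
ln -s "$SCRATCH_ROOT" "$PBS_SPOOL/tmpdir"

# 2. What the patched job manager sets just before the user job,
#    including the optional extra exports suggested above.
TMPDIR="$PBS_SPOOL/tmpdir"
export TMPDIR
export PBS_O_WORKDIR="$TMPDIR"
export SCRATCH_DIRECTORY="$TMPDIR"

# 3. The /bin/pwd test: the job should report the local scratch area
#    (/bin/pwd resolves the symlink to the physical scratch path).
cd "$TMPDIR"
/bin/pwd
```

On a real node the /bin/pwd output should be the per-node scratch directory, not the shared NFS home directory.&lt;br /&gt;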
{{TracNotice|{{PAGENAME}}}}&lt;/div&gt;</summary>
		<author><name>Maintenance script</name></author>
	</entry>
</feed>