LocalSubmission

From T2B Wiki

Direct submission to local queue on the T2_BE_IIHE cluster

Aim

  • This page gives a brief introduction to submitting jobs to the local queue.
  • The local queue lets you send executable code to the Tier2 cluster.
  • This procedure can be used to run non-CMSSW code that needs access to files on the Storage Element (SE) maite.iihe.ac.be.
  • Using this procedure also avoids overloading the User Interfaces (UIs), known as the mX machines.

Procedure

  • Log in to a UI, mX.iihe.ac.be, replacing X with a number of your choice. See policies for the rules that apply on the UIs.
  • Make a directory and prepare an executable.
mkdir directsubmissiontest
cd directsubmissiontest/
emacs script.sh&
  • Paste the following code into script.sh (see script.sh under Attachments below).
  • Due to the setup of the Tier2, the output of the script is placed on the /localgrid partition, which is mounted on both the UIs and the workernodes. You therefore need to prepare a directory there to make sure the output is stored correctly. The localgrid partition can be used as a sandbox for temporarily holding input and output files. Do not store any files there permanently.
mkdir /localgrid/$USER/directsubmissiontest
  • Execute the following command to submit the script to the local queue
qsub -q localgrid@cream02 -o script.stdout -e script.stderr script.sh
  • Follow the progress of your job on the UI
qstat -u $USER localgrid@cream02


  • Your job has finished once it no longer shows up in qstat. You should now be able to find your output files in the directory you created on localgrid:
/localgrid/$USER/directsubmissiontest/script.stdout
/localgrid/$USER/directsubmissiontest/script.stderr
/localgrid/$USER/directsubmissiontest/
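
Checking qstat by hand can be replaced by a small polling helper. The following is only a minimal sketch, not part of the official procedure: the function name wait_for_job and the default polling interval are assumptions made for illustration, and it relies only on the qstat call shown above. It also assumes that job IDs start with a number followed by a dot, as in 394402.cream02.

```shell
#!/bin/bash
# wait_for_job JOB_ID [INTERVAL_SECONDS]
# Hypothetical helper: blocks until JOB_ID no longer appears in the
# qstat listing of the current user on the local queue.
wait_for_job() {
    local job_id=$1
    local interval=${2:-30}
    local number=${job_id%%.*}   # numeric part, e.g. 394402 from 394402.cream02
    # Same qstat call as in the procedure. The job is considered finished
    # once no output line starts with "<number>.". Note that a failing
    # qstat (no output at all) is also treated as "finished".
    while qstat -u "$USER" localgrid@cream02 2>/dev/null | grep -q "^${number}\."; do
        sleep "$interval"
    done
}

# Example use on the UI (commented out, since it submits a job):
# job_id=$(qsub -q localgrid@cream02 -o script.stdout -e script.stderr script.sh)
# wait_for_job "$job_id" 60 && ls -l /localgrid/$USER/directsubmissiontest/
```

Please keep the polling interval reasonably long so that the helper itself does not load the batch server.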

Comments and FAQ

  • If you need to access a ROOT file, first copy it to the /scratch space on the workernode.
    • /scratch is the local disk of the workernode and is several hundred GB in size.
    • Each job is allotted a working directory that is cleaned automatically at the end of the job. The path of this directory is stored in the variable $TMPDIR.
    • Your procedure should look like this:
    • Copy the necessary ROOT files from /localgrid (if you have any) to $TMPDIR.
    • Make sure the output of the job is also written to $TMPDIR.
    • Copy your output files back to /localgrid.
    • Do not read ROOT files directly from /localgrid. This directory is not physically located on the workernode; it is mounted from the fileserver. Reading from it puts a heavy load on the fileserver and can make the UIs slow.
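
The copy-in, run, copy-out pattern described above can be sketched as a small shell function. This is an illustration under stated assumptions, not the site's official wrapper: the name run_staged and its argument order are hypothetical, and the function creates its own subdirectory under $TMPDIR (falling back to /tmp when $TMPDIR is unset, for example when trying it out on a UI).

```shell
#!/bin/bash
# run_staged SANDBOX_DIR INPUT_FILE OUTPUT_FILE COMMAND [ARGS...]
# Hypothetical helper: copies INPUT_FILE from the sandbox (e.g. a
# directory on /localgrid) to a local work directory, runs COMMAND
# there, and copies OUTPUT_FILE back to the sandbox.
run_staged() {
    local sandbox=$1 input=$2 output=$3
    shift 3
    local workdir
    # Private subdirectory of the job's working directory.
    workdir=$(mktemp -d "${TMPDIR:-/tmp}/staged.XXXXXX") || return 1
    # 1. Copy the input to the local disk.
    cp "$sandbox/$input" "$workdir/" || return 1
    # 2. Run the command inside the local directory, so all reads and
    #    writes hit the local disk rather than the fileserver.
    ( cd "$workdir" && "$@" ) || return 1
    # 3. Copy the result back to the sandbox and clean up.
    cp "$workdir/$output" "$sandbox/" || return 1
    rm -rf "$workdir"
}

# Example use inside a job script (paths are placeholders):
# run_staged /localgrid/$USER/directsubmissiontest input.root output.root ./myanalysis
```

If a step fails, the function returns early and leaves the work directory in place for inspection; on the workernodes it is removed together with $TMPDIR at the end of the job.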

****** IMPORTANT *******
If you use the local submission, please note that you can potentially slow down our site. To avoid this, please copy all the files you will use during the job to /scratch.
Many thanks,
The Admin Team

  • How to set up the CMSSW environment in a batch job?

Add the following lines to your script:

pwd=$PWD
source $VO_CMS_SW_DIR/cmsset_default.sh                          # make scram available
cd /localgrid/<USER NAME>/path/to/CMSSW_4_1_4/src/               # your local CMSSW release
eval `scram runtime -sh`                                         # don't use cmsenv, won't work on batch
cd $pwd
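
The same lines can be wrapped in a function that fails early when the CMS software area is not available. This is only a sketch: the function name setup_cmssw and the error message are assumptions for illustration, while the commands it runs are the ones listed above.

```shell
#!/bin/bash
# setup_cmssw RELEASE_SRC_DIR
# Hypothetical wrapper around the lines above. RELEASE_SRC_DIR is the
# src/ directory of your local CMSSW release.
setup_cmssw() {
    local release_src=$1
    local start_dir=$PWD
    # Stop early if the CMS software area is not visible on this node.
    if [ -z "$VO_CMS_SW_DIR" ] || [ ! -r "$VO_CMS_SW_DIR/cmsset_default.sh" ]; then
        echo "cmsset_default.sh not found; is VO_CMS_SW_DIR set?" >&2
        return 1
    fi
    source "$VO_CMS_SW_DIR/cmsset_default.sh"   # make scram available
    cd "$release_src" || return 1               # your local CMSSW release
    eval "$(scram runtime -sh)"                 # same effect as the backtick form
    cd "$start_dir" || return 1                 # go back to where the job started
}

# Example use inside a job script (path is a placeholder):
# setup_cmssw /localgrid/<USER NAME>/path/to/CMSSW_4_1_4/src/ || exit 1
```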
  • How to make your proxy available during batch jobs?

Make sure you have a valid proxy and copy it to some place on /localgrid:

cp $X509_USER_PROXY /localgrid/<USER NAME>/some/place

Add the following line to your script:

export X509_USER_PROXY=/localgrid/<USER NAME>/some/place
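
A job that cannot find its proxy usually fails late and with a confusing error, so it can help to check the file before exporting the variable. The sketch below is an assumption-laden illustration: the function name use_proxy is hypothetical, and the check only verifies that the file exists, is non-empty and is readable. It does not verify that the proxy is still valid.

```shell
#!/bin/bash
# use_proxy PROXY_FILE
# Hypothetical helper: exports X509_USER_PROXY only if PROXY_FILE
# exists, is non-empty and is readable; otherwise returns 1.
use_proxy() {
    local proxy=$1
    if [ ! -s "$proxy" ] || [ ! -r "$proxy" ]; then
        echo "proxy file '$proxy' is missing, empty or unreadable" >&2
        return 1
    fi
    export X509_USER_PROXY=$proxy
}

# Example use inside a job script (path is a placeholder):
# use_proxy /localgrid/<USER NAME>/some/place || exit 1
```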
  • How to keep short jobs from being stuck in the waiting queue when the site is full?

If you intend to submit short jobs, it is wise to tell the batch system their estimated maximum walltime explicitly. You can do this by adding an option to the qsub command:

 qsub -q localgrid@cream02 -o script.stdout -e script.stderr -l walltime=<HH:MM:SS> script.sh
 

or by adding the following line at the beginning of your job script:

 #PBS -l walltime=<HH:MM:SS>
 

Proceeding this way, your jobs' priority will grow faster as time goes by, increasing their chances of being executed first. (The shorter the jobs are, the faster their priority increases over time.)

But be aware that if your jobs run longer than the specified maximum walltime, they will be killed by the batch system. So don't hesitate to overestimate this maximum walltime a bit.
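
To turn an estimated run time into a padded HH:MM:SS value, a few lines of shell arithmetic are enough. This is a minimal sketch: the function name walltime_with_margin and the default margin of 20 % are assumptions chosen for illustration, not a site recommendation.

```shell
#!/bin/bash
# walltime_with_margin SECONDS [MARGIN_PERCENT]
# Hypothetical helper: adds MARGIN_PERCENT (default 20) to an estimated
# run time given in seconds and prints the result as HH:MM:SS.
walltime_with_margin() {
    local seconds=$1
    local margin_pct=${2:-20}
    local total=$(( seconds + seconds * margin_pct / 100 ))
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# Example use on the UI (commented out, since it submits a job);
# a one-hour estimate becomes a walltime of 01:12:00:
# qsub -q localgrid@cream02 -o script.stdout -e script.stderr \
#      -l walltime="$(walltime_with_margin 3600)" script.sh
```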

Stop your jobs

If for some reason you want to stop your jobs on the server, you can use this procedure:

qstat @cream02 | grep <your user name>

This will give you a list of your jobs with their IDs, for instance:

394402.cream02            submit.sh        odevroed               0 R localgrid

Now, use the ID to kill the job with the qdel command:

qdel 394402.cream02

Your job will now be removed.
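
When many jobs have to be stopped, the job IDs can be extracted from the qstat output instead of being copied by hand. The sketch below assumes the column layout of the example line above (job ID in the first column, user name in the third); the function name job_ids_for_user is hypothetical. If qstat truncates long user names, the comparison has to be made against the truncated name.

```shell
#!/bin/bash
# job_ids_for_user USER_NAME
# Hypothetical helper: reads qstat output on standard input and prints
# the job IDs (first column) of the lines whose user column (third
# column) equals USER_NAME.
job_ids_for_user() {
    awk -v user="$1" '$3 == user { print $1 }'
}

# Example use on the UI (commented out, since it deletes jobs):
# qstat @cream02 | job_ids_for_user "$USER" | xargs -r qdel
```

Run the pipeline without the final qdel step first to check which jobs would be removed.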

Attachments

  • script.sh
#!/bin/bash

##Some general shell commands
STR="Hello World!"
echo "$STR"
echo ">> script.sh is checking where it is"
pwd
echo ">> script.sh is checking how much disk space is still available"
df -h
echo ">> script.sh is listing files and directories in the current location"
ls -l
echo ">> script.sh is listing files and directories in userdir on storage element"
ls -l /pnfs/iihe/cms/store/user/$USER

##When accessing files on the storage element, it is important to execute your code on the /scratch partition of the workernode you are running on.
##Copy the executable that reads or writes root files to your job directory ($TMPDIR) and execute it there, as illustrated below.

echo ">> go to TMPDIR"
cd $TMPDIR
echo ">> ls of TMPDIR partition"
ls -l

##Create a small root macro

echo "{
  //TFile *MyFile = new TFile(\"testfile.root\",\"RECREATE\"); 
  //MyFile->ls();
  //MyFile->Close();
  TFile* f=TFile::Open(\"dcap://maite.iihe.ac.be:/pnfs/iihe/cms/store/user/$USER/testfile.root\");
  f->ls();
  f->Close();
} 
" > rootScript.C

cat rootScript.C

echo ">> set root"
##Copied a root version from /user/cmssoft into /localgrid
export ROOTSYS=/localgrid/$USER/cmssoft/root_5.26.00e_iihe_default_dcap/root 
export PATH=$PATH:$ROOTSYS/bin 
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ROOTSYS/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/lib

echo ">> execute root macro"
root -q -l -b -n rootScript.C

echo ">> ls of TMPDIR"
ls -l

echo ">> copy the file back to the /localgrid sandbox"
#cp testfile.root /localgrid/$USER/directsubmissiontest

