LocalSubmission
Direct submission to local queue on the T2_BE_IIHE cluster
Aim
- The aim of this page is to give a brief introduction on how to submit jobs to the local queue.
- The local queue allows you to send executable code to the Tier2 cluster.
- This procedure can be used to run non-CMSSW code that needs access to files on the Storage Element (SE) maite.iihe.ac.be.
- Using this procedure also avoids overloading the User Interfaces (UIs), known as the mX machines.
Procedure
- Log in to a UI mX.iihe.ac.be; replace X with a number of your choice. See the policies page for the rules on UI usage.
- Make a directory and prepare an executable.
mkdir directsubmissiontest
cd directsubmissiontest/
emacs script.sh &
- Paste the following code into script.sh (see the Attachments section below).
- Execute the following command to submit the script to the local queue:
qsub -q localgrid@cream02 -o script.stdout -e script.stderr script.sh
- Follow the progress of your job on the UI:
qstat -u $USER localgrid@cream02
- Your job is finished once it no longer appears in the qstat output. You should then find your output files in the directory you created:
/user/$USER/directsubmissiontest/script.stdout
/user/$USER/directsubmissiontest/script.stderr
/user/$USER/directsubmissiontest/
Comments and FAQ
- If you need to access a ROOT file, copy it to the /scratch space on the worker node first.
- /scratch is the local disk of the worker node and is several hundred GB in size.
- Each job is allotted a working directory that is cleaned automatically at the end of the job. This directory is stored in the variable $TMPDIR.
- Your procedure should look like this:
- Copy the necessary ROOT files (if you have any) from /localgrid to $TMPDIR.
- Make sure the output of the job is also written to $TMPDIR.
- Copy your output files back to /localgrid.
- Do not read ROOT files directly from /localgrid. This directory is not physically located on the worker node; it is mounted from the fileserver. Reading from it puts a heavy load on the fileserver and can make the UIs slow.
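The three steps above can be sketched as a job script skeleton. This is only a sketch: the /localgrid sandbox path and the input/output file names are placeholders for your own setup, and $TMPDIR is set by the batch system on the worker node (a temporary directory is used as a fallback so the sketch can also be tried interactively).

```shell
#!/bin/bash
# Sketch of the recommended /scratch workflow; adapt the placeholder paths.
WORKDIR=${TMPDIR:-$(mktemp -d)}                            # set by the batch system on a worker node
SANDBOX=${SANDBOX:-/localgrid/$USER/directsubmissiontest}  # placeholder sandbox on /localgrid

cd "$WORKDIR" || exit 1

# 1) copy the input you need from /localgrid to the local disk
if [ -f "$SANDBOX/input.root" ]; then
    cp "$SANDBOX/input.root" .
fi

# 2) run your code so that all reads and writes hit the local disk
echo "job ran in $WORKDIR" > output.txt

# 3) copy the results back to /localgrid only at the very end
if [ -d "$SANDBOX" ]; then
    cp output.txt "$SANDBOX/"
fi
```

This keeps all I/O during the job on the worker node's local disk, so the fileserver is only touched at the start and the end of the job.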
****** IMPORTANT *******
If you use local submission, please be aware that you can potentially slow down our site. So please copy all the files you will use during the job to /scratch to avoid this.
Many thanks,
The Admin Team
- How to set up the CMSSW environment in a batch job?
Add the following lines to your script:
pwd=$PWD
source $VO_CMS_SW_DIR/cmsset_default.sh             # make scram available
cd /localgrid/<USER NAME>/path/to/CMSSW_4_1_4/src/  # your local CMSSW release
eval `scram runtime -sh`                            # don't use cmsenv, it won't work in batch
cd $pwd
- How to make your proxy available during batch jobs?
Make sure you have a valid proxy and copy it to some place on /localgrid:
cp $X509_USER_PROXY ~/myProxy
Note that myProxy needs to be a file.
Add the following line to your script:
export X509_USER_PROXY=~/myProxy
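To avoid copying an almost-expired proxy, you can check its remaining lifetime first. A small sketch, assuming the grid middleware on the UI provides voms-proxy-info (the snippet falls back to 0 seconds when the command is unavailable or no proxy exists):

```shell
# Check remaining proxy lifetime (in seconds) before copying the proxy file.
timeleft=$(voms-proxy-info --timeleft 2>/dev/null || echo 0)
if [ "${timeleft:-0}" -gt 3600 ]; then
    cp "$X509_USER_PROXY" ~/myProxy   # myProxy must be a file, as noted above
else
    echo "proxy valid for less than an hour; run voms-proxy-init first"
fi
```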
- How to prevent my short jobs from being blocked in the waiting queue when the site is full?
If you intend to submit short jobs, it is wise to explicitly tell the batch system their estimated maximum walltime. You can do this by adding an option to the qsub command:
qsub -q localgrid@cream02 -o script.stdout -e script.stderr -l walltime=<HH:MM:SS> script.sh
or by adding the following line at the beginning of your job script :
#PBS -l walltime=<HH:MM:SS>
Proceeding this way, your jobs' priority will grow faster as time goes by, increasing their chances of being executed first. (The shorter they are, the faster their priority will increase over time.)
But be aware that if your jobs run longer than the specified maximum walltime, they will be killed by the batch system. So don't hesitate to overestimate this maximum walltime a bit.
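Putting this together, the top of a short job's script could look like the sketch below. The walltime value is only an example, and the output file names mirror the qsub options used earlier on this page:

```shell
#!/bin/bash
#PBS -l walltime=00:30:00   # example: tell the batch system the job needs at most 30 minutes
#PBS -o script.stdout       # same effect as the -o option of qsub
#PBS -e script.stderr       # same effect as the -e option of qsub

# The #PBS lines above are read by the batch system and ignored as comments
# when the script runs; the rest of the script is plain bash.
msg="short job starting"
echo "$msg"
```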
Stop your jobs
If for some reason you want to stop your jobs on the server, you can use this procedure:
qstat @cream02 | grep <your user name>
This will give you a list of your running jobs with their IDs, e.g.
394402.cream02 submit.sh odevroed 0 R localgrid
Now, use the ID to kill the job with the qdel command:
qdel 394402.cream02
Your job will now be removed.
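To kill several jobs at once, you can extract the IDs from the qstat output and feed them to qdel. The sketch below parses the sample line shown above so the extraction itself can be checked; on the UI you would pipe the real qstat output into the same awk command:

```shell
# Extract the job ID (first column) from a qstat output line.
sample="394402.cream02 submit.sh odevroed 0 R localgrid"
jobid=$(echo "$sample" | awk '{print $1}')
echo "$jobid"   # the ID to pass to qdel

# On the UI, to remove all of your jobs in one go (check the list first!):
# qstat @cream02 | grep $USER | awk '{print $1}' | xargs qdel
```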
Attachments
- script.sh
#!/bin/bash
## Some general shell commands
STR="Hello World!"
echo $STR
echo ">> script.sh is checking where it is"
pwd
echo ">> script.sh is checking how much disk space is still available"
df -h
echo ">> script.sh is listing files and directories in the current location"
ls -l
echo ">> script.sh is listing files and directories in userdir on storage element"
ls -l /pnfs/iihe/cms/store/user/$USER

## When accessing files on the storage element it is important to execute your
## code on the /scratch partition of the worker node you are running on.
## Therefore you need to copy your executable which is accessing/writing ROOT
## files onto the /scratch partition and execute it there. This is illustrated below.
echo ">> go to TMPDIR"
cd $TMPDIR
echo ">> ls of TMPDIR partition"
ls -l

## Create a small ROOT macro
echo "{
//TFile *MyFile = new TFile(\"testfile.root\",\"RECREATE\");
//MyFile->ls();
//MyFile->Close();
TFile* f=TFile::Open(\"dcap://maite.iihe.ac.be/pnfs/iihe/cms/store/user/$USER/testfile.root\");
f->ls();
f->Close();
}
" > rootScript.C
cat rootScript.C

echo ">> set root"
## Copied a ROOT version from /user/cmssoft into /localgrid
export ROOTSYS=/localgrid/$USER/cmssoft/root_5.26.00e_iihe_default_dcap/root
export PATH=$PATH:$ROOTSYS/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ROOTSYS/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/lib

echo ">> execute root macro"
root -q -l -b -n rootScript.C

echo ">> ls of TMPDIR"
ls -l

echo "copy the file back to the /localgrid sandbox"
#cp testfile.root /localgrid/jmmaes/directsubmissiontest