LocalSubmission

From T2B Wiki

Direct submission to local queue on the T2_BE_IIHE cluster

Read the Cluster_Presentation page first. This page assumes you are familiar with it.

Aim

  • This page is a brief introduction/workbook on how to submit jobs to the local cluster.
  • The batch system lets you send executable code to the T2B cluster.
  • This procedure can be used to run any code, including code that needs access to files on the Storage Element (SE) maite.iihe.ac.be or directly through /pnfs.
  • Submitting through the batch system avoids overloading the User Interfaces (UIs, i.e. the mX machines).

Procedure

  • Log in to a UI (m1, m8, m9)
ssh m1.iihe.ac.be
  • Make a directory and prepare an executable.
mkdir directsubmissiontest
cd directsubmissiontest/
emacs script.sh&
  • Paste the following code into script.sh (see the Attachments section below).
  • Execute the following command to submit the script to the local queue
qsub -o script.stdout -e script.stderr script.sh
  • Follow the progress of your job on the UI
qstat -u $USER
  • Your job is finished once it no longer appears in the qstat output. You should then find your output files in the directory you created:
> ls /user/$USER/directsubmissiontest
    script.stdout script.stderr
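
The "submit, then watch qstat" steps above can also be scripted. The following is a minimal sketch, assuming that qsub prints the ID of the new job on standard output (as Torque/PBS does) and that each job line in the qstat output starts with that ID, as in the example shown in the "Stop your jobs" section. The function name wait_for_job and the 60 second default are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: wait_for_job is a hypothetical helper, not a site-provided tool.

# Block until the given job ID no longer shows up in `qstat -u $USER`.
#   $1: job ID as printed by qsub (e.g. 394402.cream02)
#   $2: seconds between two polls (assumed default: 60)
wait_for_job() {
    local job_id="$1"
    local interval="${2:-60}"
    local job_number="${job_id%%.*}"   # numeric part before the first dot

    # qstat lists one job per line, starting with the job ID.
    while qstat -u "$USER" 2>/dev/null | grep -q "^${job_number}\."; do
        sleep "$interval"
    done
}

# Typical use on a UI (commented out so that sourcing this file submits nothing):
# job_id=$(qsub -o script.stdout -e script.stderr script.sh)
# wait_for_job "$job_id"
# ls script.stdout script.stderr
```

Polling once per minute keeps the load on the batch server low. A job that has disappeared from qstat has left the queue, but it may still have failed, so check script.stderr afterwards.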

More details: some comments and FAQ

Comments

  • If you need to access a ROOT file, first copy it to $TMPDIR (=/scratch/jobid.cream02.ac.be/), a space on the workernode that is unique to each job.
    • /scratch is the local disk of the workernode and is several hundred GB in size.
    • Each job is allotted a working directory that is cleaned automatically at the end of the job. The path of this directory is stored in the variable $TMPDIR.
    • Do not read ROOT files from /user. This directory is not physically located on the workernode; it is mounted from the fileserver. Reading from it puts a heavy load on the fileserver and can make the UIs slow.


****** IMPORTANT *******
Jobs submitted locally can slow down our site. To avoid this, copy all the files your job will use to $TMPDIR before reading them:

dccp dcap://maite.iihe.ac.be/pnfs/iihe/..../MYFILE $TMPDIR/
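
Inside a job script, the dccp copy above can be wrapped so that the job stops early when the copy fails. This is a minimal sketch under stated assumptions: stage_in is a hypothetical function name, and the only external command used is dccp with the same "source URL, destination directory" form as above.

```shell
#!/bin/bash
# Sketch only: stage_in is a hypothetical helper name.

# Copy one file from the SE to the job's scratch directory and print the
# resulting local path.
#   $1: dcap URL of the input file
stage_in() {
    local src_url="$1"

    if [ -z "$TMPDIR" ] || [ ! -d "$TMPDIR" ]; then
        echo "stage_in: TMPDIR is not set or is not a directory" >&2
        return 1
    fi

    # Same command as on this page, with the destination quoted.
    dccp "$src_url" "$TMPDIR/" || return 1
    echo "$TMPDIR/$(basename "$src_url")"
}

# Typical use inside the job script (the path is elided exactly as on this page):
# input=$(stage_in dcap://maite.iihe.ac.be/pnfs/iihe/..../MYFILE) || exit 1
# echo ">> input staged at $input"
```

The job then reads only the local copy, so the fileserver and the SE are contacted once per input file rather than on every read.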

FAQ

How to set CMSSW environment in a batch job

Add the following lines to your script:

pwd=$PWD
source $VO_CMS_SW_DIR/cmsset_default.sh                          # make scram available
cd /user/$USER/path/to/CMSSW_X_Y_Z/src/                          # your local CMSSW release
eval `scram runtime -sh`                                         # don't use cmsenv, won't work on batch
cd $pwd
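
If several job scripts need the same setup, the lines above can be collected in a function that also stops when a step fails. This is a sketch only: setup_cmssw is a hypothetical name, and the function uses nothing beyond the commands already shown (cmsset_default.sh, scram runtime -sh).

```shell
#!/bin/bash
# Sketch only: setup_cmssw is a hypothetical helper name.

# Set up the CMSSW runtime environment of a release, then return to the
# directory the job was in.
#   $1: path to the src/ directory of your CMSSW release
setup_cmssw() {
    local release_src="$1"
    local start_dir="$PWD"

    # make scram available
    source "$VO_CMS_SW_DIR/cmsset_default.sh" || return 1
    cd "$release_src" || return 1
    # cmsenv is not used here because it does not work on batch
    eval "$(scram runtime -sh)"
    cd "$start_dir" || return 1
}

# Typical use inside the job script:
# setup_cmssw "/user/$USER/path/to/CMSSW_X_Y_Z/src/" || exit 1
```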

How to make your proxy available during batch jobs (for instance to write to /pnfs)

  • Create a proxy with long validity time:
voms-proxy-init --voms MYEXPERIMENT --valid 192:0
   MYEXPERIMENT is one of cms, icecube, solidexperiment.org, beapps, ...
  • Copy it to /user
cp $X509_USER_PROXY /user/$USER/
  • In the sh script you send to qsub, add the line:
export X509_USER_PROXY=/user/$USER/x509up_u$(id -u $USER)    # Or the name of the proxy you copied before if you changed the name
  • Then, to copy a file that your job created in the /scratch area to /pnfs, run:
gfal-copy file://$TMPDIR/MYFILE srm://maite.iihe.ac.be:8443/pnfs/iihe/MY/DIR/MYFILE
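
The proxy and copy steps above can be combined in the job script so that a missing proxy or a missing output file is reported before the copy is attempted. This is a minimal sketch: stage_out is a hypothetical function name, and it relies only on the gfal-copy call and the X509_USER_PROXY variable already shown.

```shell
#!/bin/bash
# Sketch only: stage_out is a hypothetical helper name.

# Copy one file from the job's scratch directory to the SE.
#   $1: name of the file inside $TMPDIR
#   $2: full destination URL (srm://maite.iihe.ac.be:8443/pnfs/iihe/...)
stage_out() {
    local file_name="$1"
    local dest_url="$2"

    if [ ! -f "$X509_USER_PROXY" ]; then
        echo "stage_out: no proxy file at '$X509_USER_PROXY'" >&2
        return 1
    fi
    if [ ! -f "$TMPDIR/$file_name" ]; then
        echo "stage_out: '$TMPDIR/$file_name' does not exist" >&2
        return 1
    fi

    gfal-copy "file://$TMPDIR/$file_name" "$dest_url"
}

# Typical use at the end of the job script:
# export X509_USER_PROXY=/user/$USER/x509up_u$(id -u $USER)
# stage_out MYFILE srm://maite.iihe.ac.be:8443/pnfs/iihe/MY/DIR/MYFILE || exit 1
```

Because $TMPDIR is cleaned automatically at the end of the job, the copy has to happen inside the job script itself.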

Stop your jobs

If for some reason you want to stop your jobs, first list them with: qstat -u $USER

This gives you a list of your jobs with their IDs, for instance:

394402.cream02            submit.sh        odevroed               0 R localgrid

Now, use the ID to kill the job with the qdel command: qdel 394402.cream02

Your job will now be removed.
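
To remove several jobs at once, the two steps above can be chained. This is a minimal sketch, assuming that every job line printed by qstat starts with the job ID in the form number.server, as in the example line above; header lines are skipped because they do not start with a number. The function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: list_my_job_ids and delete_my_jobs are hypothetical helper names.

# Print the IDs of all jobs listed by `qstat -u $USER`, one per line.
list_my_job_ids() {
    # Keep only lines whose first column looks like 394402.cream02
    qstat -u "$USER" | awk '$1 ~ /^[0-9]+\./ { print $1 }'
}

# Call qdel once for every job returned by list_my_job_ids.
delete_my_jobs() {
    local id
    for id in $(list_my_job_ids); do
        qdel "$id"
    done
}

# Typical use on a UI (commented out so that sourcing this file deletes nothing):
# list_my_job_ids      # check the list first
# delete_my_jobs
```

Running list_my_job_ids on its own first is a cheap way to check which jobs would be removed.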

Attachments

  • script.sh
#!/bin/bash

## Some general shell commands
STR="Hello World!"
echo "$STR"
echo ">> script.sh is checking where it is"
pwd
echo ">> script.sh is checking how much disk space is still available"
df -h
echo ">> script.sh is listing files and directories in the current location"
ls -l
echo ">> script.sh is listing files and directories in userdir on storage element"
ls -l /pnfs/iihe/cms/store/user/$USER

echo ">> go to TMPDIR"
cd $TMPDIR
echo ">> ls of TMPDIR partition"
ls -l

## Create a small ROOT macro

echo "{
  //TFile *MyFile = new TFile(\"testfile.root\",\"RECREATE\");
  //MyFile->ls();
  //MyFile->Close();
  TFile* f=TFile::Open(\"dcap://maite.iihe.ac.be/pnfs/iihe/cms/store/user/$USER/testfile.root\");
  // TFile::Open returns a null pointer if the file cannot be opened
  if (f) {
    f->ls();
    f->Close();
  }
}
" > rootScript.C

cat rootScript.C


echo ">> execute root macro"
root -q -l -b -n rootScript.C

echo ">> ls of TMPDIR"
ls -l