Crab
Instructions for IIHE users
IIHE specific code
We have modified some of the base code in the CRAB releases; the modified versions are available as CRAB_X_Y_Z_IIHE.
Keep this in mind if you are running on a UI other than your usual one.
Initialise environment
Note that the old-framework CMS software is no longer supported in the newest CRAB releases. If you still need it, make sure to set up the correct environment.
- first set up the CMS+CRAB environment by sourcing one of
    source /msa3/crab/start.sh
    source /msa3/crab/start.csh
    source /msa3/crab/start_oldfm.sh     (old framework)
    source /msa3/crab/start_oldfm.csh    (old framework)
- in case something goes wrong (or you want a specific version), try the following
    ## setup base CMS-soft parameter
    export VO_CMS_SW_DIR=/msa3/cmssoft
    ## setup CMS-soft environment
    source $VO_CMS_SW_DIR/cmsset_default.sh
    ## the directory latest is actually a symlink, pointing to the latest stable version
    ## feel free to try another one in that directory (or latest_oldfm)
    source /msa3/crab/latest/crab.sh
- next, go into your project directory and run eg (you may need to replace scramv1 with scram and -sh with -csh)
    eval `scramv1 runtime -sh`
- go back to your working directory and copy the example crab.cfg from the CRAB installation into it:
cp $CRABDIR/python/crab.cfg .
- modify the values in this crab.cfg file
- create the jobs:
- create all jobs: crab -create all
- create the first 5 jobs: crab -create 5
- submit the jobs: crab -submit all -c
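The steps above can be sketched as a small helper that only prints the command sequence for review and executes nothing. The function names and the example directories ($HOME/myproject, $HOME/work) are hypothetical; the script paths and crab options are the ones listed above, and the sh variant is assumed.

```shell
#!/bin/sh
# Pick the environment script listed above.
# $1: "old" for the old framework, anything else for the current one
# $2: shell flavour, "sh" or "csh"
start_script() {
    case "$1" in
        old) suffix="_oldfm" ;;
        *)   suffix="" ;;
    esac
    echo "/msa3/crab/start${suffix}.$2"
}

# Print the command sequence; nothing is executed.
# $1: framework ("old" or "new"), $2: project directory,
# $3: working directory, $4: number of jobs to create, or "all"
print_crab_steps() {
    echo "source $(start_script "$1" sh)"
    echo "cd $2"
    echo 'eval `scramv1 runtime -sh`'
    echo "cd $3"
    echo 'cp $CRABDIR/python/crab.cfg .'
    echo "# edit crab.cfg before creating the jobs"
    echo "crab -create $4"
    echo "crab -submit all -c"
}

# Example with hypothetical directories: create the first 5 jobs.
print_crab_steps new "$HOME/myproject" "$HOME/work" 5
```

Printing rather than running keeps the sketch harmless on a machine without the CMS environment; the printed lines can be pasted into an interactive shell one by one.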
CMSSW
crab.cfg
In crab.cfg, set the correct jobtype
    [CRAB]
    jobtype = cmssw
Jobs without datasets
When a user wants to do very basic MC production, (s)he can also use CRAB for this. The configuration is identical to that of any other analysis job; the only difference is that no dataset is used. In crab.cfg use
    [CMSSW]
    datasetpath=None
This will only provide basic MC. For anything more complex ProdAgent should be used.
No output from executable
Sometimes the output of the executable can cause major trouble, mostly because the output file becomes huge. To prevent this, there is an option that disables all output generated by the executable. In crab.cfg do
    [USER]
    job_output = no
Disable dcache readahead
You normally should not need this! Inform the grid admins in case you run into this.
    [USER]
    dcache_ra = off
Run adaptscript
CRAB may not properly handle every feature you need. Always mention this on the CRAB mailing list to see what the developers suggest as a solution. While you are waiting for a fix from the CRAB team, you can use this option to run a script that can eg modify any parameter in the cmsRun .cfg file.
    [USER]
    adaptscript = /full/path/to/script
Tips:
- the config file for the executable is called pset.cfg at the time the script runs. Make sure it is available again after your script runs!
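As an illustration of the tip above, here is a minimal sketch of what such a script could look like. It assumes pset.cfg sits in the directory where the script is started, and the parameter it rewrites (an "untracked int32 maxEvents" line) is purely hypothetical; adapt the sed expression to whatever you need to change.

```shell
#!/bin/sh
# Hypothetical adaptscript: set one integer parameter in the cmsRun config.
# $1: config file (pset.cfg when run under CRAB), $2: new value
adapt_pset() {
    cfg=$1
    value=$2
    tmp="$cfg.tmp.$$"
    # Rewrite the (hypothetical) maxEvents line; every other line is copied unchanged.
    if sed "s/^\( *untracked int32 maxEvents *= *\)[0-9][0-9]*/\1${value}/" "$cfg" > "$tmp"; then
        # Move the result back only after sed succeeded, so that
        # pset.cfg is available again under its original name.
        mv "$tmp" "$cfg"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Assumption: pset.cfg is in the current directory when the script runs.
if [ -f pset.cfg ]; then
    adapt_pset pset.cfg 500
fi
```

Writing to a temporary file and moving it back means the original pset.cfg is left untouched if sed fails.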
FAMOS
(needs crab_oldfm!)
crab.cfg
In crab.cfg, set the correct jobtype
    [CRAB]
    jobtype = famos
Prepare ntpl-data for FAMOS
CRAB and FAMOS expect the ntpls used to be available through the standard grid file catalogue (LFC). The file catalogue is just that: a catalogue. It keeps track of a logical filename (LFN) and the corresponding physical filename (PFN) and its replicas.
CRAB+FAMOS retrieves the ntpl as follows, eg (from FAMOS.sh):
    the_ntuple=su05_pyt_lm6_$NJob.ntpl
    input_lfn=georgia/$the_ntuple
    lcg-cp --vo $VO lfn:$input_lfn file:`pwd`/$the_ntuple
in crab.cfg this is configured as:
    ### LFN of the input file registered into the LFC catalog
    input_lfn = georgia/su05_pyt_lm6.ntpl
    ## LFC catalog parameters
    lcg_catalog_type = lfc
    lfc_host = lfc-cms-test.cern.ch
    lfc_home = /grid/cms
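The relation between the input_lfn value in crab.cfg and the per-job file that FAMOS.sh fetches can be sketched as follows. The helper name job_lfn is hypothetical, and the real FAMOS.sh may build the name differently; the sketch only reproduces the naming pattern of the example above.

```shell
#!/bin/sh
# Derive the per-job LFN from the input_lfn configured in crab.cfg.
# $1: input_lfn as written in crab.cfg (ending in .ntpl), $2: job number
job_lfn() {
    # Strip the .ntpl extension, then append _<job number>.ntpl
    base=${1%.ntpl}
    echo "${base}_$2.ntpl"
}

# For job 3 of the example configuration:
job_lfn georgia/su05_pyt_lm6.ntpl 3
# prints georgia/su05_pyt_lm6_3.ntpl
```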
So what does the user need to do?
- think of a logical filestructure (eg like a normal directory structure)
- create the ntpl according to this structure: ie <name>_<job_nr>.ntpl
- choose your catalog host:
export LFC_HOST=<name of the catalog-host>
    eg CMS:   export LFC_HOST=lfc-cms-test.cern.ch
    eg BECMS: export LFC_HOST=laranja.iihe.ac.be
- create the logical structure in the catalog (it's of the form /grid/<VO>/something_you_choose):
    lfc-mkdir -p <some substructure that is unique to you>
    eg CMS:   lfc-mkdir -p /grid/cms/stdweird/test-dir
    eg BECMS: lfc-mkdir -p /grid/becms/stdweird/other-test-dir
HINT: you can specify LFC_HOME to shorten the lfn:
    eg BECMS: export LFC_HOME=/grid/becms/stdweird; lfc-mkdir other-test-dir
- copy the ntpl files to a storage element and register them in the catalogue using lcg-cr:
    lcg-cr --vo <VO> -d <the name of the storage-element> -l <the LFN> file:<FULL PATH to the actual file>
    eg CMS:   lcg-cr --vo cms -d castorsc.grid.sinica.edu.tw -l lfn:/grid/cms/stdweird/test-dir/test-file_1.ntpl file:`pwd`/test-file
    eg BECMS: lcg-cr --vo becms -d maite.iihe.ac.be -l lfn:/grid/becms/stdweird/other-test-dir/test-file_1.ntpl file:`pwd`/test-file
- in the crab.cfg file (input_lfn is the filename WITHOUT the _$NJob!! FAMOS.sh adds the appropriate numbers.):
    [FAMOS]
    input_lfn=<your lfn starting from lfc_home>
    [EDG]
    lcg_catalog_type = lfc
    lfc_host = <your LFC_HOST>
    lfc_home = <your LFC_HOME>
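When there are many numbered ntpl files, the registration step can be scripted. The sketch below only prints one lcg-cr command per file so the list can be checked before anything is copied. The function name is hypothetical, and it assumes that job numbers start at 1 and that the local files already carry the <name>_<job_nr>.ntpl names.

```shell
#!/bin/sh
# Print one lcg-cr command per numbered ntuple; nothing is executed.
# $1: VO, $2: storage element, $3: LFN directory in the catalogue,
# $4: base name (without _<job_nr>.ntpl), $5: number of jobs,
# $6: local directory holding the files
print_register_cmds() {
    vo=$1; se=$2; lfn_dir=$3; name=$4; njobs=$5; src_dir=$6
    i=1
    while [ "$i" -le "$njobs" ]; do
        file="${name}_${i}.ntpl"
        echo "lcg-cr --vo $vo -d $se -l lfn:$lfn_dir/$file file:$src_dir/$file"
        i=$((i + 1))
    done
}

# Example using the BECMS values from above, for two jobs:
print_register_cmds becms maite.iihe.ac.be /grid/becms/stdweird/other-test-dir test-file 2 "$PWD"
```

With these names, the matching crab.cfg entry would be input_lfn=other-test-dir/test-file.ntpl, given lfc_home = /grid/becms/stdweird.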
ORCA
(needs crab_oldfm!)
crab.cfg
In crab.cfg, set the correct jobtype
    [CRAB]
    jobtype = orca
General items in crab.cfg
BEcms support
If you want to submit your jobs to a VO other than CMS, set the following parameter, eg to becms
    [EDG]
    virtual_organization = becms
White/Blacklists
Normally, for a default job, there are a number of sites you can run on. To see these sites, go to the crab_0_..../job dir and run edg-job-list-match on the .jdl files.
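Running the match for every .jdl file in the job directory can be done with a short loop. This is a sketch only: the function name and the MATCH_CMD override are hypothetical additions that allow previewing the loop (eg with MATCH_CMD=echo) without contacting the grid.

```shell
#!/bin/sh
# Run the matching command on every .jdl file in a CRAB job directory.
# $1: the job directory inside your CRAB working directory
# MATCH_CMD (optional): command to run instead of edg-job-list-match
match_all_jdl() {
    job_dir=$1
    for jdl in "$job_dir"/*.jdl; do
        # Skip the unexpanded pattern when the directory has no .jdl files.
        [ -e "$jdl" ] || continue
        echo "== $jdl"
        ${MATCH_CMD:-edg-job-list-match} "$jdl"
    done
}
```

Call it from the CRAB working directory as match_all_jdl job, or pass the full path of the job directory.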
Sometimes users have additional information, unknown to the grid information system, that can strongly influence job execution. For this, CRAB has the notion of white- and blacklists. In crab.cfg these are set in eg
    [EDG]
    ce_white_list =
    ce_black_list =
Users should put part of the site's CE Fully Qualified Domain Name (FQDN) there, so that the provided value matches the FQDN of the site. Eg the FQDN of the IIHE's CE is gridce.iihe.ac.be, so putting ce_white_list = iihe will match the site.
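The matching described here can be pictured as a plain substring test. The sketch below assumes that is all CRAB does, which may not hold for the real implementation; the function names and the second host name (ce.example.org) are hypothetical.

```shell
#!/bin/sh
# Succeed if the pattern occurs anywhere in the CE's FQDN.
# $1: white- or blacklist value, $2: CE FQDN
ce_matches() {
    case "$2" in
        *"$1"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Print only the CEs that match a white-list value.
# $1: white-list value, remaining arguments: CE FQDNs
filter_whitelist() {
    pattern=$1
    shift
    for ce in "$@"; do
        if ce_matches "$pattern" "$ce"; then
            echo "$ce"
        fi
    done
}

# Only the IIHE CE survives the filter:
filter_whitelist iihe gridce.iihe.ac.be ce.example.org
# prints gridce.iihe.ac.be
```

A short value such as "be" would match many more hosts than intended, so pick a fragment that is specific to the site.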