Crab

From T2B Wiki

Instructions for IIHE users

IIHE specific code

We have modified some of the base code in the CRAB releases; the modified version is available as CRAB_X_Y_Z_IIHE.

 Don't forget this if you are working on a UI other than your usual one.

Initialise environment

Note that the old-framework CMS software is no longer supported by the newest CRAB releases. If you still need it, make sure to set up the correct environment.

  • first set up the CMS+CRAB environment by sourcing one of the following (the .sh files are for sh-type shells, the .csh files for csh-type shells):
  source /msa3/crab/start.sh 
  source /msa3/crab/start.csh

  source /msa3/crab/start_oldfm.sh (old framework)
  source /msa3/crab/start_oldfm.csh (old framework)
    • if something goes wrong (or you want a specific version), try the following:
  ## setup base CMS-soft parameter
  export VO_CMS_SW_DIR=/msa3/cmssoft
  ## setup CMS-soft environment
  source $VO_CMS_SW_DIR/cmsset_default.sh
  ## the directory latest is actually a symlink, pointing to the latest stable version
  ## feel free to try another one in that directory (or latest_oldfm)
  source /msa3/crab/latest/crab.sh 
  • next, go into your project directory and run, e.g. (you may need to replace scramv1 with scram, and -sh with -csh):
  eval `scramv1 runtime -sh`
  • go back to your working directory and copy the example crab.cfg from the CRAB installation:
  cp $CRABDIR/python/crab.cfg .
  • modify the values in this crab.cfg file
  • create the jobs:
    • create all jobs: crab -create all
    • create the first 5 jobs: crab -create 5
  • submit the jobs: crab -submit all -c
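
The create and submit steps above can also be scripted. The following is a minimal Python sketch, assuming crab is on the PATH once the environment has been set up. The helper names (crab_create, crab_submit, run) are hypothetical; they only assemble the command lines shown above and print them unless dry_run is switched off.

```python
import subprocess


def crab_create(n_jobs="all"):
    """Return the argument list for 'crab -create <n_jobs>'."""
    return ["crab", "-create", str(n_jobs)]


def crab_submit(n_jobs="all"):
    """Return the argument list for 'crab -submit <n_jobs> -c'."""
    return ["crab", "-submit", str(n_jobs), "-c"]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.check_call(cmd)


if __name__ == "__main__":
    run(crab_create(5))   # create the first 5 jobs
    run(crab_submit())    # submit all created jobs
```

Keeping dry_run=True by default makes it possible to check the commands before anything is sent to the grid.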

CMSSW

crab.cfg

In crab.cfg, set the correct jobtype

 [CRAB]
 jobtype = cmssw

Jobs without datasets

CRAB can also be used for very basic MC production. The configuration is identical to that of any other analysis job, except that no dataset is specified. In crab.cfg use

  [CMSSW]
  datasetpath=None

This will only provide basic MC. For anything more complex ProdAgent should be used.
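
If you prefer to apply these settings from a script rather than by hand, the sketch below shows one possible way, assuming crab.cfg can be parsed by Python's standard configparser module (an assumption: the real file may contain constructs it does not accept). The function name set_no_dataset is hypothetical. Note that configparser drops comments when it writes the file back.

```python
import configparser
import io


def set_no_dataset(cfg_text):
    """Return crab.cfg text with jobtype = cmssw and datasetpath = None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(cfg_text)
    for section in ("CRAB", "CMSSW"):
        if not parser.has_section(section):
            parser.add_section(section)
    parser.set("CRAB", "jobtype", "cmssw")
    parser.set("CMSSW", "datasetpath", "None")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(set_no_dataset("[CRAB]\njobtype = orca\n"))
```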

No output from executable

Sometimes the output of the executable can cause major trouble (mostly because of the huge size of the file). To prevent this, all output generated by the executable can be disabled. In crab.cfg set

  [USER]
  job_output = no

Disable dcache read-ahead

You normally should not need this! Inform the grid admins in case you run into this.

  [USER]
  dcache_ra = off

Run adaptscript

It is possible that not all the features you want are properly handled by CRAB. You should always mention this on the CRAB mailing list to see what the developers suggest as a solution. While you are waiting for a fix from the CRAB team, you can use this option to run a script that can, e.g., modify any parameter in the cmsRun .cfg file.

  [USER]
  adaptscript = /full/path/to/script

Tips:

  • the config file for the executable is called pset.cfg at the time the script runs. Make sure it is available again after your script runs!
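
As an illustration, here is a minimal sketch of what such a script could look like in Python. It assumes the script is started in the directory that contains pset.cfg and that the parameter appears as "name = value" in that file. The parameter name maxEvents and the value 100 are hypothetical placeholders, not something CRAB prescribes. The new content is written to a temporary file and then renamed, so that a file called pset.cfg is still present when the script exits.

```python
import os
import re
import tempfile


def adapt_pset(path="pset.cfg", name="maxEvents", value="100"):
    """Rewrite 'name = <old>' to 'name = value' in the file at path.

    Returns the number of replacements made. The default name and
    value are hypothetical placeholders.
    """
    with open(path) as handle:
        text = handle.read()
    pattern = re.compile(r"(\b%s\s*=\s*)[^\s}]+" % re.escape(name))
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text)
    # Write to a temporary file in the same directory, then rename it
    # over the original so pset.cfg never goes missing.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(new_text)
    os.replace(tmp_path, path)
    return count


if __name__ == "__main__" and os.path.exists("pset.cfg"):
    adapt_pset()
```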

FAMOS

(needs crab_oldfm!)

crab.cfg

In crab.cfg, set the correct jobtype

 [CRAB]
 jobtype = famos

Prepare ntpl-data for FAMOS

CRAB and FAMOS expect the ntpl files they use to be available through the standard grid file catalogue (LFC). The file catalogue is just that: a catalogue. It maps a logical filename (LFN) to the physical filename (PFN) and its replicas.

CRAB+FAMOS retrieves the ntpl as follows (e.g., from FAMOS.sh):

  the_ntuple=su05_pyt_lm6_$NJob.ntpl
  input_lfn=georgia/$the_ntuple
  lcg-cp --vo $VO lfn:$input_lfn file:`pwd`/$the_ntuple

In crab.cfg this is configured as:

  ### LFN of the input file registered into the LFC catalog
  input_lfn = georgia/su05_pyt_lm6.ntpl
  ## LFC catalog parameters
  lcg_catalog_type = lfc
  lfc_host = lfc-cms-test.cern.ch
  lfc_home = /grid/cms
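
To make the naming convention explicit: the per-job file name is the configured input_lfn with _<job number> inserted before the extension. The Python sketch below only illustrates that rule. FAMOS.sh does this in shell, and the function names job_lfn and full_lfn are hypothetical. It also assumes that the full catalogue path is lfc_home followed by the LFN.

```python
import posixpath


def job_lfn(input_lfn, job_number):
    """Insert '_<job_number>' before the extension of input_lfn."""
    stem, ext = posixpath.splitext(input_lfn)
    return "%s_%d%s" % (stem, job_number, ext)


def full_lfn(lfc_home, input_lfn, job_number):
    """Join lfc_home and the per-job LFN (assumed catalogue layout)."""
    return posixpath.join(lfc_home, job_lfn(input_lfn, job_number))


if __name__ == "__main__":
    print(job_lfn("georgia/su05_pyt_lm6.ntpl", 3))
    # -> georgia/su05_pyt_lm6_3.ntpl
    print(full_lfn("/grid/cms", "georgia/su05_pyt_lm6.ntpl", 3))
    # -> /grid/cms/georgia/su05_pyt_lm6_3.ntpl
```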

So what does the user need to do?

  • think of a logical file structure (e.g. like a normal directory structure)
  • name the ntpl files according to this structure, i.e. <name>_<job_nr>.ntpl
  • choose your catalog host:
  export LFC_HOST=<name of the catalog-host>
 eg CMS: export LFC_HOST=lfc-cms-test.cern.ch
 eg BECMS: export LFC_HOST=laranja.iihe.ac.be
  • create the logical structure in the catalog (it is of the form /grid/<VO>/something_you_choose):
  lfc-mkdir -p <some substructure that is unique to you>
 eg CMS:lfc-mkdir -p /grid/cms/stdweird/test-dir
 eg BECMS:lfc-mkdir -p /grid/becms/stdweird/other-test-dir
 HINT: you can specify LFC_HOME to shorten the lfn:
 eg BECMS: export LFC_HOME=/grid/becms/stdweird;lfc-mkdir other-test-dir
  • copy the ntpl files to a storage element and register them in the catalog using lcg-cr:
  lcg-cr --vo <VO> -d <the name of the storage-element> -l <the LFN> file:<FULL PATH to the actual file>
 eg CMS: lcg-cr --vo cms -d castorsc.grid.sinica.edu.tw -l lfn:/grid/cms/stdweird/test-dir/test-file_1.ntpl file:`pwd`/test-file
 eg BECMS: lcg-cr --vo becms -d maite.iihe.ac.be -l lfn:/grid/becms/stdweird/other-test-dir/test-file_1.ntpl file:`pwd`/test-file
  • in the crab.cfg file (input_lfn is the filename WITHOUT the _$NJob!! FAMOS.sh adds the appropriate numbers.):
  [FAMOS]
  input_lfn=<your lfn starting from lfc_home>
  [EDG]
  lcg_catalog_type = lfc
  lfc_host = <your LFC_HOST>
  lfc_home = <your LFC_HOME>
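
When many ntpl files have to be registered, the lcg-cr calls can be generated rather than typed. The sketch below builds one argument list per file, using only the lcg-cr options that appear in the examples above. The function names are hypothetical, and numbering the jobs from 1 is an assumption; adjust it if your jobs are numbered differently. The commands are only built, not executed.

```python
import os


def lcg_cr_command(vo, storage_element, lfn, local_path):
    """Argument list for one lcg-cr call, mirroring the examples above."""
    return [
        "lcg-cr", "--vo", vo,
        "-d", storage_element,
        "-l", "lfn:" + lfn,
        "file:" + os.path.abspath(local_path),
    ]


def commands_for_jobs(vo, storage_element, lfn_dir, name, n_jobs,
                      local_dir="."):
    """One command per file named <name>_<job_nr>.ntpl (job_nr from 1)."""
    commands = []
    for job_nr in range(1, n_jobs + 1):
        filename = "%s_%d.ntpl" % (name, job_nr)
        commands.append(lcg_cr_command(
            vo, storage_element,
            lfn_dir.rstrip("/") + "/" + filename,
            os.path.join(local_dir, filename),
        ))
    return commands


if __name__ == "__main__":
    for cmd in commands_for_jobs("becms", "maite.iihe.ac.be",
                                 "/grid/becms/stdweird/other-test-dir",
                                 "test-file", 2):
        print(" ".join(cmd))
```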

ORCA

(needs crab_oldfm!)

crab.cfg

In crab.cfg, set the correct jobtype

 [CRAB]
 jobtype = orca

General items in crab.cfg

BEcms support

To submit your jobs to a VO other than CMS, set the following parameter to that VO, e.g. becms:

 [EDG]
 virtual_organization = becms

White/Blacklists

Normally, with a default job, there are a number of sites you can run on. To see these sites, go to the crab_0_..../job dir and run edg-job-list-match on the .jdl files.

Sometimes users have additional information, unknown to the grid information system, that can strongly influence job execution. For this, CRAB has a notion of white- and blacklists. In crab.cfg these are set as follows:

  [EDG]
  ce_white_list = 
  ce_black_list =

Users should put part of the site's CE Fully Qualified Domain Name (FQDN) there, so that the provided value matches the FQDN of the site. E.g. the FQDN of the IIHE's CE is gridce.iihe.ac.be, so putting ce_white_list = iihe will match the site.
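
The matching can be pictured as a simple substring test. The Python sketch below is only an illustration of that idea, not CRAB's actual implementation. How the two lists are combined (whitelist applied first, blacklist always excluding) is an assumption, and ce.example.org is a made-up host name.

```python
def matches(fqdn, patterns):
    """True if any non-empty pattern is a substring of the FQDN."""
    return any(bool(p) and p in fqdn for p in patterns)


def select_ces(fqdns, white_list=(), black_list=()):
    """Keep CEs matching the whitelist (if any) and not the blacklist."""
    selected = []
    for fqdn in fqdns:
        if white_list and not matches(fqdn, white_list):
            continue
        if matches(fqdn, black_list):
            continue
        selected.append(fqdn)
    return selected


if __name__ == "__main__":
    sites = ["gridce.iihe.ac.be", "ce.example.org"]
    print(select_ces(sites, white_list=["iihe"]))
    # -> ['gridce.iihe.ac.be']
```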
