Crabpage

How to submit jobs with crab

To submit jobs with Crab, follow these steps:

  • Make sure that the CMS/Crab environment is set up

Set up the Crab environment

  • Log in to a machine where the Crab software is installed, e.g.:
  ssh -X `whoami`@master2.iihe.ac.be
  • Set up the environment
    • source the CMS environment
  setenv VO_CMS_SW_DIR /msa3/cmssoft
  source $VO_CMS_SW_DIR/cmsset_default.csh ## initialize the cms env
    • source the Crab environment
  source /msa3/crab/latest/crab.csh
  • Next, go to your project directory and run, e.g. (depending on your installation you may need scram instead of scramv1, and -sh instead of -csh if you use a Bourne-type shell):
  eval `scramv1 runtime -csh`
  • Note that you can put these commands in your .cshrc file
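
As a sketch of the note above, the environment setup could be collected in ~/.cshrc roughly as follows. The paths are the ones quoted on this page. The existence checks are an addition made here (an assumption, so that a login on a machine without these paths does not fail). The scram runtime call is left out because it has to be run from inside a project directory.

```csh
# ~/.cshrc fragment (sketch): CMS and Crab environment
setenv VO_CMS_SW_DIR /msa3/cmssoft

# Source the CMS environment only if the setup script is present (added check, assumption)
if ( -f $VO_CMS_SW_DIR/cmsset_default.csh ) then
    source $VO_CMS_SW_DIR/cmsset_default.csh
endif

# Source the Crab environment only if the setup script is present (added check, assumption)
if ( -f /msa3/crab/latest/crab.csh ) then
    source /msa3/crab/latest/crab.csh
endif
```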

Adjust the crab.cfg to your specific needs

  • Go to your working directory and copy the example crab.cfg from the Crab installation:
  cp $CRABDIR/python/crab.cfg .
  • This file is the configuration for Crab. It is divided into sections marked with '[]':
    • [CMSSW]: CMSSW-related settings (number of events, cfg file, ...)
      • datasetpath = [string that describes the dataset]. You can find this on the DBS discovery page.
      • pset = [the name of the Pset which fits your code]. This is the config file you pass to 'cmsRun' in an interactive run. Make sure it works interactively before submitting to the grid.
      • output_file = [the name of your output file]. Important: be consistent with the name inside your pset. You can give more than one (comma separated).
      • total_number_of_events = [number of events you want to access, -1 for all data]
      • Crab works out the job splitting from whichever of the following you specify:
        • events_per_job = [number of events accessed by a single job]
        • number_of_jobs = [number of jobs you want to run]
    • [USER]: user-related settings (where to store the output, ...)
      • return_data = [0|1]: send output_file, together with stdout and stderr, back to the user interface
      • copy_data = [0|1]: copy output_file to a storage element. If copy_data = 1, you have to specify storage_element = [name of the storage element to copy output files to] and storage_path = [path on the SE]. Important: Crab creates the directory on the storage for you (if you have permission), but if the output file already exists there, the new output file will not be written to storage.
    • [EDG]: grid-related settings ...
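
To illustrate how the settings above fit together, here is a minimal sketch of the two sections described on this page. All values are hypothetical placeholders (dataset path, pset name, output file name, storage element and path) and must be replaced with your own. The crab.cfg copied from $CRABDIR contains more sections and options than are shown here.

```ini
[CMSSW]
# Placeholder: take the real string from the DBS discovery page
datasetpath = /PrimaryDataset/ProcessedDataset/DataTier
# Hypothetical file name: the config you already tested with cmsRun
pset = myAnalysis.cfg
# Must match the output file name used inside the pset
output_file = myOutput.root
total_number_of_events = 10000
# Give either events_per_job or number_of_jobs
events_per_job = 1000

[USER]
return_data = 0
copy_data = 1
# Placeholders: ask your site for the actual storage element and path
storage_element = <name of the storage element>
storage_path = <path on the SE>
```

With these numbers one would expect roughly 10000 / 1000 = 10 jobs. The count Crab actually chooses may differ, since the splitting also depends on how the dataset is organised.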

Create and submit a crab job

  • Before you can submit your jobs, everything that is needed has to be prepared:
  crab -create
  • This will create a directory crab_0_<date>_
  • Once everything is created you can submit the jobs. By default all jobs are submitted; specify N to limit the number of jobs that are submitted:
  crab -submit <N> -c
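
As a sketch, a first test submission of a few jobs could look as follows. The number 5 is an arbitrary example. It is assumed here, based on how the Crab command-line help describes the option, that -c without a directory name continues with the most recently created task; if in doubt, pass the directory created by crab -create explicitly.

```csh
# Run from the directory that contains crab.cfg

# Prepare the jobs (creates the crab_0_<date>_ directory)
crab -create

# Submit only the first 5 jobs of the task that was just created
crab -submit 5 -c
```

Submitting a small number of jobs first is a cheap way to check that the pset and the output settings work on the grid before the whole task is sent.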


