CMSSWDeploymentInDetail

From T2B Wiki
Revision as of 12:28, 26 August 2015


CMSSW Deployment in Detail

release-installer.pl takes care of creating and submitting installation and removal jobs for sites where CMSSW releases are missing or obsolete. It fetches the output of jobs afterwards and makes sure that the tags are published correctly at the sites.

However, do not use this script directly. Use the script ri.sh instead, which takes care of your grid certificates automatically. You can save another 5 keystrokes by using ri, an alias for ./ri.sh that is created when you source CreateProxy.sh.


The ri functions

All commands are shown with the -c CEname option. With this option, the command is executed only for the site whose CE name is given as the argument. To run the command for all sites one after another, simply omit the -c CEname part.

Submit deployment jobs

   ri -sc CEname
  1. Check whether the site is available for CMSSW deployment
   This is the case if
      • No other deployment job is running (see further)
      • The site does not need manual attention (see further)
      • The site is not in maintenance
      • There is no tag CMSSW_X_Y_Z_processing_install at the site
   If these requirements are not fulfilled, the script stops
  2. Check which releases are missing and which are obsolete
   This is done by answering the following questions:
      • Which releases are required and which are obsolete?
      • Which versions are installed at the site?
      • Which operating system does the site run? (CMSSW >= 3_4 requires SL5)
      • What are the manual requirements for this site? (see further)
   A list of missing and obsolete releases is displayed
  3. Create an installation and removal job
      • For obsolete releases, ri changes the tag from CMSSW_X_Y_Z to CMSSW_X_Y_Z_remove_scheduled
      • The obsolete release is only removed by the first ri -sc CEname run 100 hours or more after the original submission
      • For missing releases, a tag CMSSW_X_Y_Z_processing_install is added to the site
      • A grid job with all installation and removal commands is sent to the site
      • At the end of the job, the CMSSW_X_Y_Z_processing_install tags are replaced by CMSSW_X_Y_Z and the CMSSW_X_Y_Z_remove_scheduled tags are removed
        (provided the release has actually been removed)
      • The site gets a subdirectory named CEname in the directory ~/cms/release-installer/running
        As long as this directory exists, no new job can be submitted to the site
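
Two of the gating rules above (a CEname subdirectory under running blocks new submissions, and an obsolete release is only removed once 100 hours have passed since the original submission) can be sketched in shell. This is an illustration only, not the code of release-installer.pl: the function names, the RI_RUNNING_DIR variable and the use of epoch timestamps are assumptions made for this example.

```shell
# Illustrative sketch only; release-installer.pl may implement these checks differently.

# Hypothetical variable: directory holding one subdirectory per site with a running job.
RI_RUNNING_DIR="${RI_RUNNING_DIR:-$HOME/cms/release-installer/running}"

# Succeeds (exit status 0) if a job directory for the given CE exists,
# i.e. no new job may be submitted to that site.
site_is_blocked() {
    [ -d "$RI_RUNNING_DIR/$1" ]
}

# Succeeds if at least 100 hours lie between two epoch timestamps
# (first argument: time of the original submission, second: current time).
removal_due() {
    [ $(( $2 - $1 )) -ge $(( 100 * 3600 )) ]
}
```

A call such as removal_due "$submitted" "$(date +%s)" would then tell whether a release tagged CMSSW_X_Y_Z_remove_scheduled is old enough to be removed, where submitted is a hypothetical variable holding the submission time in seconds since the epoch.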


Checking the site's missing and obsolete releases

   ri -dc CEname
  executes steps 1 and 2 of ri -sc CEname (see above)


Checking the status of running jobs

   ri -l


Fetching the output of deployment jobs

   ri -f
  • ~/cms/release-installer/running/CEname is moved to ~/cms/release-installer/running/done/CEname, where the output is stored
  • If the job was not successful (non-zero exit code), ~/cms/release-installer/done/CEname is copied to ~/cms/release-installer/running/done/CEname
  As long as this last directory exists, no jobs can be submitted to this site. On ri -sc CEname, the message "site needs manual intervention" is displayed.

The -f can be combined with the -s option:

   ri -fs

Then, directly after fetching the output of the finished jobs, a new submission round will start.


Stealing others' deployment jobs

   -6 otheruser
  Monitoring or fetching jobs submitted by others is not allowed by default.
  It can be done with the option -6 (666 = EVIL), which actually steals the proxy of the job submitter.

Monitoring the overall deployment status

The most important tools to monitor the deployment status are:

  • ri -d shows which releases are missing/obsolete at all sites (see above)
  • ri -l shows at which sites jobs are running (see above)
  • the SAM pages (Service Availability Monitoring)
  • the SAM dashboard
  a nice overview of the SAM pages
  • releases.html
   ./publish_to_html.pl ~/.globus_<YourUserName>/cert.p12 releases.html
  This script creates a convenient overview of the deployment status, showing for each site which releases are missing, how old the latest SAM test is and which operating system is installed.
  For the correct layout, download the .css file into the same directory as releases.html (run wget from that directory)
   wget http://www.desy.de/~wbehrenh/releases.css
  • check the published tags at a site
   lcg-tags --vo cms --ce CEname --list 
  • plot the performance of the CMSSW installation
   ./showtimes.pl CMSSW_X_Y_Z > plots/CMSSW_X_Y_Z.dat
   ./showtimes.pl CMSSW_A_B_C > plots/CMSSW_A_B_C.dat
   cd plots
   root -b -q -l 'plottimes.C+("CMSSW_X_Y_Z,CMSSW_A_B_C","lukas.png")'
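
When checking the published tags at a site, the tags that mark work in progress are the ones with the _processing_install and _remove_scheduled suffixes described for ri -sc. A small filter along the following lines could pick them out of a tag list. It is a sketch that assumes the list arrives one tag per line on standard input, with no trailing whitespace; the function name pending_tags is hypothetical.

```shell
# Print only the tags that mark a pending installation or a scheduled removal.
# Assumption: one tag per line on standard input.
pending_tags() {
    grep -E '_(processing_install|remove_scheduled)$'
}
```

If the output of lcg-tags --vo cms --ce CEname --list has that one-tag-per-line shape, it could be piped straight into pending_tags.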
   

Editing Tags

  • add tags for releases CMSSW_A_B_C and CMSSW_D_E_F at site CEname:
   lcg-tags --vo cms --ce CEname --add --tags VO-cms-CMSSW_A_B_C,VO-cms-CMSSW_D_E_F  
  • remove tags for releases CMSSW_A_B_C and CMSSW_D_E_F at site CEname:
   lcg-tags --vo cms --ce CEname --remove --tags VO-cms-CMSSW_A_B_C,VO-cms-CMSSW_D_E_F
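
The value passed to --tags in the commands above is a comma-separated list in which each release name is prefixed with VO-cms-. For longer lists it may be convenient to build that value from the plain release names. The helper below is a minimal sketch; the function name release_tags is hypothetical and not part of the deployment scripts.

```shell
# Turn a list of release names into the comma-separated value for --tags,
# e.g. CMSSW_A_B_C CMSSW_D_E_F -> VO-cms-CMSSW_A_B_C,VO-cms-CMSSW_D_E_F
release_tags() {
    tags=""
    for release in "$@"; do
        # Prepend a comma only when the list already has an entry.
        tags="${tags:+$tags,}VO-cms-$release"
    done
    printf '%s\n' "$tags"
}
```

It could be used as lcg-tags --vo cms --ce CEname --add --tags "$(release_tags CMSSW_A_B_C CMSSW_D_E_F)".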

Asking site administrators for support

Open a Savannah ticket

  • log in
  • go to "my groups"
  • on the page for the appropriate group there is a link to open a new ticket
      • Category: Facility Operations
      • Assigned to: there should be an entry like cmscompinfrasup-<sitename>
      • Use GGUS: no
      • Site: the affected site
      • Summary: mention "central CMSSW installation", the site's name (e.g. "T1_CO_BLA") and the problem
      • Add "cmscompinfrasup-cmsswdeploy" to "Mail Notification Carbon-Copy List"




