Olivier
Introduction
This is my personal space, where I try to centralize the knowledge needed as a grid administrator. It is mainly a collection of links to the pages that detail the actions to be taken, so this page serves as a central node from which that information is accessible. For convenience, the SiteMap page lists all the pages and gives a handy overview.
Monitoring
A lot of processes need to be monitored. The most important are the Ganglia systems that monitor the activity of the dCache nodes, both in CPU and in bandwidth.
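As a rough illustration of what such monitoring data looks like, here is a minimal sketch that pulls CPU and bandwidth values out of a Ganglia-style XML report. The sample XML, the cluster name and the choice of watched metrics (`cpu_user`, `bytes_in`, `bytes_out`) are assumptions made for this example, not values taken from our setup.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of a Ganglia XML report (assumption: one host,
# three metrics; a real report contains many more attributes and metrics).
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="dcache">
    <HOST NAME="behar022">
      <METRIC NAME="cpu_user" VAL="12.5" TYPE="float" UNITS="%"/>
      <METRIC NAME="bytes_in" VAL="1048576.0" TYPE="float" UNITS="bytes/sec"/>
      <METRIC NAME="bytes_out" VAL="524288.0" TYPE="float" UNITS="bytes/sec"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

# Metrics of interest: CPU usage and network bandwidth.
WATCHED = ("cpu_user", "bytes_in", "bytes_out")


def host_metrics(xml_text, watched=WATCHED):
    """Return {host name: {metric name: value}} for the watched metrics."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        values = {}
        for metric in host.iter("METRIC"):
            name = metric.get("NAME")
            if name in watched:
                values[name] = float(metric.get("VAL"))
        result[host.get("NAME")] = values
    return result


if __name__ == "__main__":
    for host, values in host_metrics(SAMPLE_XML).items():
        print(host, values)
```

In practice the Ganglia web pages already show these numbers; a script like this is only useful for quick ad-hoc checks.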
Dcache
dCache is a complicated system to manage the automatic installation of the infrastructure, such as worker nodes, UIs and storage pools. Look at the DCache page for more info. The troubleshooting section is also of interest, as it details the steps to take to debug the system.
1. POOLS
A pool can be seen as a logical volume with a RAID system on it. Thus, several pools can reside on one machine. At the IIHE, a pool gets the name of the machine it is physically on, followed by an _ and a number. So, for instance, behar022_1 and behar022_2 both reside on behar022. On the other hand, behar6 only has one pool on it: behar6_1
To check all kinds of statistics about the pools, visit the following pages:
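The naming convention can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and `behar022_2` is assumed here as the name of a second pool on behar022.

```python
from collections import defaultdict


def split_pool_name(pool):
    """Split a pool name such as 'behar022_1' into ('behar022', 1)."""
    # Split on the LAST underscore, so the machine name itself is kept whole.
    host, sep, index = pool.rpartition("_")
    if not sep or not host or not index.isdigit():
        raise ValueError(f"not a <machine>_<number> pool name: {pool!r}")
    return host, int(index)


def pools_by_host(pools):
    """Group pool numbers by the machine they reside on."""
    grouped = defaultdict(list)
    for pool in pools:
        host, index = split_pool_name(pool)
        grouped[host].append(index)
    return {host: sorted(indices) for host, indices in grouped.items()}


if __name__ == "__main__":
    print(pools_by_host(["behar022_1", "behar022_2", "behar6_1"]))
    # prints {'behar022': [1, 2], 'behar6': [1]}
```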
transfers
dcache head page
2. NODES
Phedex
All PhEDEx-related material can be found on the appropriate page.
Lexicon
I came across a lot of acronyms that are quite esoteric to me, so I keep a small lexicon here. I will also try to keep this lexicon ordered alphabetically, but I make no promises :-)
BDII::
CA:: Certification Authority. In Belgium, there is only one, and it is BELNET.
CE:: Computing Element
CE/GK:: no idea up to now
DBS:: The CMS Dataset Bookkeeping System (DBS) is a database and user API that indexes event data for the CMS Collaboration. Its primary function is to provide cataloging by production and analysis operations and to allow data discovery by CMS physicists. To locate datasets, go to the DBS discovery page.
EGEE:: Enabling Grids for E-SciencE http://www.eu-egee.org/
LCG:: The LHC Computing Grid http://lcg.web.cern.ch/
NCM:: Network Configuration Management
NFS:: The Network File System (NFS) is the oldest NAS (Network Attached Storage) protocol. Developed by Sun in the ’80s and made an open standard, NFS makes files on the network available anywhere.
pNFS:: Parallel NFS. An extension of NFS for parallel access. See DESY for more info.
PRODAGENT:: Production agent: an agent (worker node) that produces simulation data. See the CERN page.
RGMA SERVER:: no idea up to now
SE:: Storage Element
VO:: Virtual Organisation
VOMS:: Virtual Organisation Membership Service
WN:: Worker Node: the PC doing all the number crunching.
WMS:: Workload Management System: a task scheduler, runs on ...