AdminPage
Revision as of 07:43, 9 May 2023
Management of the whole cluster
CMS Services
- how to use tokens and openID in the grid
- Phedex
- Heartbeat
- LoadTest
- FroNTier
- ProdAgent
- instructions to commit siteconf to git
Grid Configuration Issues
- Update the certificates of all our machines
- Issues with CREAM and how to solve them
- PBS TMPDIR
- APEL (OBSOLETE)
- BDII
- FTS
- SL4 x86_64 WNs (OBSOLETE)
- CE overloaded
- RB
- IPMI
- Upgrade CA certificates (OBSOLETE)
- Shutting down the cluster
- Software Area Switch
- Kernel mandatory updates for critical vulnerabilities
- Argus server and glexec on the workernodes
- Apel gap publishing
- Update IGTF CA certificates
Files section
- dCache
- Procedure for removal of old user files on pnfs
- Retrieve lost files from datasets
- Storage Consistency
- rucio commands
Status and Monitoring
- List of reserved WNs
- Todo-list
- Monitoring
- Plans/Schedule
- Grid Troubleshooting link
- Incident Reports
- How to put the software back
- What to do when a WN sends a "bad_wn.pl" email to grid_admin?
- Nagios Installation at IIHE
- How to restart dCache
Info
- General info
- Installing CMSSW
- Installing CRAB
- System Benchmarks
- T2B Trac config info
- Hardware information
- Network Setup
- Setup Monitoring of LSI Disk Controller on Sunfire V20z Server
- LDAP authentication system for the replication between UCL and IIHE sites (OBSOLETE)
- IIHE Grid-admin survival guide
- Solaris 10
- Adding an SSD card and configuring RAID, zpools, filesystems and shares on the new Solaris fileserver
- Linux tricks for admins
- How to implement local PBS submission with CRAB? (OBSOLETE)
- How to create an account for a CMS user from UCL? (OBSOLETE)
- Deploying OS errata (OBSOLETE)
- Howto benchmark a node with HEPSPEC06
- Install a new dCache pool
- Backup of the users home dirs on Jefke
- Migration of mon and its Web services
- HOWTO restart a nagios test manually
- Compile and install ROOT
- Clean creamdb
- Reboot campaign for the workernodes:
- Reboot after a kernel update
- Reboot after an OS upgrade
- Force reboot a WN with hanging NFS: echo 1 > /proc/sys/kernel/sysrq; echo b > /proc/sysrq-trigger
- Force shutdown a WN with hanging NFS: echo 1 > /proc/sys/kernel/sysrq; echo o > /proc/sysrq-trigger
- Central management of all the admin scripts with Git
- Help page for all iihe scripts
- Configuration of a proxy for CVMFS
- How to test NFS Performance
- Alternatives to Tetex
- A new easy method to update kernel on the workernodes
- About automatic mail sending from the cluster
- T2B Trac access configuration
- Surviving RHEL7
- Experimental : Securing profiles with Kerberos
- Migration of T2B Wiki from Trac to MediaWiki
- Message Of The Day (motd)
- Support of Long-tail of Science
- Querying BDII
- Machine private certificate with EL7
- Cluster usage accounting statistics
- Singularity container creation
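The two forced-reset entries in the list above (force reboot and force shutdown of a WN with hanging NFS) both write to the kernel's magic SysRq interface. A minimal sketch of how they could be wrapped in one helper is given here. The function name `sysrq_trigger` and the `SYSRQ_ROOT` variable are assumptions made for illustration and do not come from the cluster scripts; `SYSRQ_ROOT` exists only so the function can be tried against a scratch directory instead of the real `/proc`.

```shell
#!/bin/sh
# Sketch only: wraps the two SysRq one-liners from the list above.
# sysrq_trigger and SYSRQ_ROOT are hypothetical names, not existing tools.
sysrq_trigger() {
    action="${1:-}"                  # "b" = immediate reboot, "o" = power off
    root="${SYSRQ_ROOT:-/proc}"      # override for dry runs; defaults to /proc

    # Accept only the two actions used in the list above.
    case "$action" in
        b|o) ;;
        *) echo "usage: sysrq_trigger b|o" >&2; return 2 ;;
    esac

    # Enable all SysRq functions, then send the command key.
    echo 1 > "$root/sys/kernel/sysrq" || return 1
    echo "$action" > "$root/sysrq-trigger" || return 1
}
```

Run as root on a real node, `sysrq_trigger b` reboots at once and `sysrq_trigger o` powers the machine off, in both cases without syncing or unmounting filesystems, so this is a last resort for a node that no longer responds to a normal reboot.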
Quattor
- FAQ - Aide-mémoire - Howtos
- Manage repositories with quattor
- How to build an RPM from a tag in Github
- Working in CB9 (Quattor release >= 14.2)
- How to add a new version of quattor in our scdb
- Quattor and FreeIPA
- Hard disks management
- How to use metaconfig (with examples)
- Aquilon
- BEgrid wiki (OBSOLETE)
- Test things (OBSOLETE)
- Lemon installation (OBSOLETE)
- Pointers to more in-depth information on quattor (OBSOLETE)
- Adding a new machine to the cluster (OBSOLETE)
- Automatic generation of hardware and profile templates for new workernodes (OBSOLETE: use script create_wn)
- Installation of a Quattor deployment server release 13.1 (OBSOLETE: see quattor template for aii server)
- How to add a new OS to the Quattor Repository (OBSOLETE)
- How to migrate workernodes from CB8 to CB9 (HISTORICAL)
- Howto build a new pysvn on a SL63 AII server (HISTORICAL)
FreeIPA
KVM virtualization
- Virtualization of the new CREAM-CE on dom02 with KVM
- Installation of the new virtualization server dom04
- Easy creation of virtual machines
- Monitoring the KVM vHosts with Ganglia
T2B Cloud
- Transforming the KVM hypervisors farm into an OpenNebula cloud
- Working in the T2B cloud
- Migrate one DB from sqlite to mysql
- Backup of the T2B Cloud
- Dealing with iPXE
- Resizing the drive of a VM
Clouds for users
gUSE/WS-PGRADE portal
Migration to EMI-3
XEN
CEPH
SEE PRIVATE WIKI
CEPH Old (deprecated)
- Understanding Ceph
- Installing Ceph with Quattor
- Experiments with Ceph
- Operating a Ceph cluster
- Deploying a new Ceph Octopus cluster
- Mounting a RBD on a client machine
- Manage the Crush map
- Manage CephFS
Logstash / Elasticsearch / Kibana (ELK)
machine: log10 | interface | index manager
Network
- Bonding of 2 interfaces + tagging of 2 VLANs on the bond (PRIV+PUB)
- Managing the Huawei CE8850-32CQ-EI 100G switch