AdminPage
Latest revision as of 12:09, 25 July 2024
Management of the whole cluster
CMS Services
- How to use tokens and openID in the grid
- Phedex
- Heartbeat
- LoadTest
- FroNTier
- ProdAgent
- Instructions to commit siteconf to git
Grid Configuration Issues
- Update the certificates of all our machines
- Issues with cream and how to solve them
- PBS TMPDIR
- APEL (OBSOLETE)
- BDII
- FTS
- SL4 x86_64 WNs (OBSOLETE)
- CE overloaded
- RB
- IPMI
- Upgrade CA certificates (OBSOLETE)
- Shutting down the cluster
- Software Area Switch
- Kernel mandatory updates for critical vulnerabilities
- Argus server and glexec on the workernodes
- Apel gap publishing
- Update IGTF CA certificates
Files section
- dCache
- Procedure for removal of old user files on pnfs
- Retrieve lost files from datasets
- Storage Consistency
- rucio commands
Status and Monitoring
- List of reserved WNs
- Todo-list
- Monitoring
- Plans/Schedule
- Grid Troubleshooting link
- Incident Reports
- How to put the software back
- What to do when a WN sends a "bad_wn.pl" email to grid_admin?
- Nagios Installation at IIHE
- How to restart DCache
Info
- General info
- Installing CMSSW
- Installing CRAB
- System Benchmarks
- T2B Trac config info
- Hardware information
- Network Setup
- Setup Monitoring of LSI Disk Controller on Sunfire V20z Server
- LDAP authentication system for the replication between UCL and IIHE sites (OBSOLETE)
- IIHE Grid-admin survival guide
- Solaris 10
- Adding an SSD card and configuring RAID, zpools, filesystems and shares on the new Solaris fileserver
- Linux tricks for admins
- How to implement local PBS submission with CRAB? (OBSOLETE)
- How to create an account for a CMS user from UCL? (OBSOLETE)
- Deploying OS errata (OBSOLETE)
- Howto benchmark a node with HEPSPEC06
- Install a new dCache pool
- Backup of the users home dirs on Jefke
- Migration of mon and its Web services
- HOWTO restart a nagios test manually
- Compile and install ROOT
- Clean creamdb
- Reboot campaign for the workernodes:
  - Reboot after a kernel update
  - Reboot after an OS upgrade
  - Force reboot a WN with hanging nfs: echo 1 > /proc/sys/kernel/sysrq; echo b > /proc/sysrq-trigger
  - Force shutdown a WN with hanging nfs: echo 1 > /proc/sys/kernel/sysrq; echo o > /proc/sysrq-trigger
- Central management of all the admin scripts with Git
- Help page for all iihe scripts
- Configuration of a proxy for CVMFS
  - How to recover CVMFS
- How to test NFS Performance
- Alternatives to Tetex
- A new easy method to update kernel on the workernodes
- About automatic mail sending from the cluster
- T2B Trac access configuration
- Surviving RHEL7
- Experimental : Securing profiles with Kerberos
- Migration of T2B Wiki from Trac to MediaWiki
- Message Of The Day (motd)
- Support of Long-tail of Science
- Querying BDII
- Machine private certificate with EL7
- Cluster usage accounting statistics
- Singularity container creation
- Explaining the Apel accounting system
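The two forced reboot/shutdown one-liners listed under the reboot campaign for the workernodes can be wrapped in a small helper. This is only a sketch: the function name force_wn and the SYSRQ_APPLY variable are hypothetical and are not part of the IIHE admin scripts. By default the helper only prints the commands it would run; it writes to /proc only when SYSRQ_APPLY=1 is set, and it must then be run as root on the hanging WN itself.

```shell
#!/bin/sh
# Hypothetical wrapper around the two SysRq one-liners from the list above.
# Dry-run by default: nothing is written to /proc unless SYSRQ_APPLY=1.
force_wn() {
    case "$1" in
        reboot)   key=b ;;   # 'b': immediate reboot, no sync, no unmount
        shutdown) key=o ;;   # 'o': immediate power off
        *) echo "usage: force_wn reboot|shutdown" >&2; return 2 ;;
    esac
    if [ "${SYSRQ_APPLY:-0}" = "1" ]; then
        echo 1 > /proc/sys/kernel/sysrq       # enable all SysRq functions
        echo "$key" > /proc/sysrq-trigger     # trigger the action
    else
        echo "echo 1 > /proc/sys/kernel/sysrq"
        echo "echo $key > /proc/sysrq-trigger"
    fi
}
```

Calling force_wn reboot without SYSRQ_APPLY prints the two commands for review. Neither action syncs or unmounts filesystems first, which is why they still work when nfs is hanging, and also why they are a last resort.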
Quattor
- FAQ - Aide-mémoire - Howtos
- Manage repositories with quattor
- How to build an RPM from a tag in Github
- Working in CB9 (Quattor release >= 14.2)
- How to add a new version of quattor in our scdb
- Quattor and FreeIPA
- Hard disks management
- How to use metaconfig (with examples)
- Aquilon
- BEgrid wiki (OBSOLETE)
- Test things (OBSOLETE)
- Lemon installation (OBSOLETE)
- Pointers to more in-depth information on quattor (OBSOLETE)
- Adding a new machine to the cluster (OBSOLETE)
- Automatic generation of hardware and profile templates for new workernodes (OBSOLETE: use script create_wn)
- Installation of a Quattor deployment server release 13.1 (OBSOLETE: see quattor template for aii server)
- How to add a new OS to the Quattor Repository (OBSOLETE)
- How to migrate workernodes from CB8 to CB9 (HISTORICAL)
- Howto build a new pysvn on a SL63 AII server (HISTORICAL)
FreeIPA
- Fix IPA client certificates
KVM virtualization
- Virtualization of the new CREAM-CE on dom02 with KVM
- Installation of the new virtualization server dom04
- Easy creation of virtual machines
- Monitoring the KVM vHosts with Ganglia
T2B Cloud
- Transforming the KVM hypervisors farm into an OpenNebula cloud
- Working in the T2B cloud
- Migrate one DB from sqlite to mysql
- Backup of the T2B Cloud
- Dealing with iPXE
- Resizing the drive of a VM
- Restoring an OpenNebula frontend from a backup
Clouds for users
- VUB-ULB cloud
- BEgrid cloud (part of FedCloud)
gUSE/WS-PGRADE portal
Migration to EMI-3
XEN
CEPH
SEE PRIVATE WIKI
CEPH Old (deprecated)
- Understanding Ceph
- Installing Ceph with Quattor
- Experiments with Ceph
- Operating a Ceph cluster
- Deploying a new Ceph Octopus cluster
- Mounting a RBD on a client machine
- Manage the Crush map
- Manage CephFS
Logstash / Elasticsearch / Kibana (ELK)
machine: log10 | interface | index manager
Network
- Bonding of 2 interfaces + tagging of 2 vlans on the bond (PRIV+PUB)
- Managing the Huawei CE8850-32CQ-EI 100G switch
HTCondor clusters
- Testing local submission
- Testing grid submission
- HTCondor cheat sheet
- HTCondor Python binding