UpgradeWNstoSL5.5
Upgrade of Workernodes in CB6 to SL5.5
- Currently all nodes in CB6 are running SL5.3; the goal is to upgrade them to SL5.5.
- The testing node is node16-5; the first step is to put it offline so it can drain.
pbsnodes -o node16-5.wn.iihe.ac.be
- In the template for the OS version, the entry for node16-5 is changed to sl550-x86_64
/CBv6/cfg/sites/iihe-production/site/os_version_db.tpl
- This file is then compiled in the CDB and committed
- To make the changes do a runcheck on ccq3
ssh ccq3
cd /opt/CB6/svncheck/
./runcheck
- Then apply the changes
aii-shellfe --remove node16-5.wn.iihe.ac.be     # remove the current configuration for this machine
aii-shellfe --configure node16-5.wn.iihe.ac.be  # create the new kickstart file for this machine
aii-shellfe --install node16-5.wn.iihe.ac.be    # update DHCP and PXE so this machine is installed at next boot
- To check the status, issue something like this
for line in $(cat wn_list); do aii-shellfe --status $line; done
REMARK: it is better to run the --remove step before the runcheck, so that the new XML file does not clash with the existing configuration.
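The aii-shellfe steps above can be sketched as a small dry-run helper for a whole list of nodes. This is only an illustration: the function name `aii_cmds` is made up, and the list file is assumed to hold one fully qualified hostname per line. It prints the commands instead of running them, so they can be reviewed first.

```shell
# Hypothetical helper: print one aii-shellfe command per node for a given step.
# Assumes one hostname per line; blank lines and lines starting with '#'
# are skipped. Nothing is executed, the commands are only printed.
aii_cmds() {
    step="$1"
    list_file="$2"
    case "$step" in
        remove|configure|install|status) ;;
        *) echo "unknown step: $step" >&2; return 1 ;;
    esac
    while IFS= read -r node || [ -n "$node" ]; do
        case "$node" in
            ''|'#'*) continue ;;
        esac
        echo "aii-shellfe --$step $node"
    done < "$list_file"
}
```

For example, `aii_cmds remove wn_list` could be run before the runcheck, and `aii_cmds configure wn_list` and `aii_cmds install wn_list` afterwards; piping the output to `sh` would execute it once it looks right.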
- Reboot the machine
ssh node16-5.wn reboot -n
- Once the machine is back, you will see the following error message in the SPMA log file
/var/log/spma.log error code: cpio: open failed - Too many open files
- To fix this you have to reboot the machine after moving the kickstart log file
mv ks-post-install.log ks-post-install.log.old
- Once the Quattor installation has finished successfully, you still have to reboot one last time
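A quick way to detect the cpio error described above is to grep the log for it. The sketch below assumes the message appears in the log exactly as quoted; the function name `has_open_files_error` is invented for illustration and the log path is passed as an argument.

```shell
# Hypothetical check: succeeds (exit status 0) if the given log file contains
# the cpio error quoted above. Assumes the message text matches verbatim.
has_open_files_error() {
    grep -q 'cpio: open failed - Too many open files' "$1"
}
```

Run against the SPMA log on the freshly installed node, a zero exit status would indicate that the workaround (move the kickstart log, then reboot) is needed.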
Re-installation campaign
- The re-installation campaign is executed in the following steps
- Adapt the file os_version_db.tpl in the quattor database, compile, commit and do a runcheck
- First of all, all nodes need to be reconfigured with the aii-shellfe command. A small script to do this can be found in the following location:
[root@ccq3] /root/joris/nodeReinstallation.sh
- The script takes as its argument a file listing all the workernodes to be removed. The file was generated with a script found here:
[root@ccq] /root/stephane/generate_list_of_online_nodes_CB6.sh
- Once the script is executed on ccq3 it is time to move to ccq to perform the reboot of all the nodes. To do this Stephane has written a nice script. More details about this script can be found here.
- The script has been adapted by Joris to allow the re-installation of a workernode. Technically, a few more reboot cycles have been added, with the appropriate checks to verify that the installation completed correctly. The script can be found here and is executed with the same options as the original script
cd /root/stephane/UserCode/T2B_IIHE/reboot_wns
./reboot_wns.pl --init wnfile_list_sl55campaign
./reboot_wns.pl --start
- This directory also contains some small scripts to test various things, e.g.
- ./print_wn_job_number.pl wn_list: shows the number of jobs running on this node
- ./clear_after_test.pl: removes all lists of workernodes
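The scripts above are driven by a file listing workernodes, so a sanity check of that file before starting a campaign can catch truncated hostnames early. The sketch below is an assumption-laden illustration: the function name `check_wn_list` is made up, and the file is assumed to contain one hostname per line in the form nodeX-Y.wn.iihe.ac.be (the actual format expected by reboot_wns.pl is not documented here).

```shell
# Hypothetical sanity check for a workernode list file.
# Assumes one hostname per line in the form nodeX-Y.wn.iihe.ac.be.
# Prints offending lines (with line numbers) and returns 1 if any are found.
check_wn_list() {
    pattern='^node[0-9]+-[0-9]+\.wn\.iihe\.ac\.be$'
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
    # grep -c prints the number of non-matching lines (and exits 1 when it is 0).
    bad_count=$(grep -c -v -E "$pattern" "$1" || true)
    if [ "$bad_count" -ne 0 ]; then
        grep -n -v -E "$pattern" "$1"
        return 1
    fi
    return 0
}
```

A possible use is `check_wn_list wnfile_list_sl55campaign && ./reboot_wns.pl --init wnfile_list_sl55campaign`, so that the campaign is only initialised when every line looks like a full hostname.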
List of workernodes that are under update procedure
workernode | status | done |
---|---|---|
node15-1.wn.iihe.ac.be | y + on | |
node15-2.wn.iihe.ac.be | y + on | |
node15-3.wn.iihe.ac.be | y + on | |
node15-4.wn.iihe.ac.be | not in production | |
node15-5.wn.iihe.ac.be | not in production | |
node15-6.wn.iihe.ac.be | not in production | |
node15-7.wn.iihe.ac.be | not in production | |
node15-8.wn.iihe.ac.be | y + on | |
node16-1.wn.iihe.ac.be | y + on | |
node16-2.wn.iihe.ac.be | reserved | y |
node16-3.wn.iihe.ac.be | y + on | |
node16-4.wn.iihe.ac.be | y + on | |
node16-5.wn.iihe.ac.be | y + on | |
node16-6.wn.iihe.ac.be | removed | |
node16-7.wn.iihe.ac.be | y + on | |
node16-8.wn.iihe.ac.be | reserved | |
node16-9.wn.iihe.ac.be | y + on | |
node16-10.wn.iihe.ac.be | reserved | y |
node17-1.wn.iihe.ac.be | reserved | y |
node17-2.wn.iihe.ac.be | n/a | |
node17-3.wn.iihe.ac.be | y + on | |
node17-4.wn.iihe.ac.be | y + on | |
node17-5.wn.iihe.ac.be | reserved | y |
node17-6.wn.iihe.ac.be | y + on | |
node17-7.wn.iihe.ac.be | y + on | |
node17-8.wn.iihe.ac.be | y + on | |
node17-9.wn.iihe.ac.be | y + on | |
node17-10.wn.iihe.ac.be | y + on | |
node17-11.wn.iihe.ac.be | y + on | |
node17-12.wn.iihe.ac.be | y + on | |
node17-13.wn.iihe.ac.be | y + on | |
node17-14.wn.iihe.ac.be | down | |
node18-1.wn.iihe.ac.be | y + on | |
node18-2.wn.iihe.ac.be | y + on | |
node18-3.wn.iihe.ac.be | y + on | |
node18-4.wn.iihe.ac.be | y + on | |
node18-5.wn.iihe.ac.be | y + on | |
node18-6.wn.iihe.ac.be | y + on | |
node18-7.wn.iihe.ac.be | y + on | |
node18-8.wn.iihe.ac.be | y + on | |
node19-1.wn.iihe.ac.be | y + on | |
node19-2.wn.iihe.ac.be | y + on | |
node19-3.wn.iihe.ac.be | y + on | |
node19-4.wn.iihe.ac.be | y + on | |
node19-5.wn.iihe.ac.be | y + on | |
node19-6.wn.iihe.ac.be | y + on | |
node19-7.wn.iihe.ac.be | y + on | |
node19-8.wn.iihe.ac.be | y + on | |
node19-9.wn.iihe.ac.be | y + on | |
node19-10.wn.iihe.ac.be | y + on | |
node19-11.wn.iihe.ac.be | y + on | |
node19-12.wn.iihe.ac.be | y + on | |
node19-13.wn.iihe.ac.be | y + on | |
node19-14.wn.iihe.ac.be | y + on | |
node19-15.wn.iihe.ac.be | y + on | |
node19-16.wn.iihe.ac.be | y + on | |
node19-17.wn.iihe.ac.be | y + on | |
node19-18.wn.iihe.ac.be | y + on | |
node19-19.wn.iihe.ac.be | y + on | |
node19-20.wn.iihe.ac.be | y + on | |
node19-21.wn.iihe.ac.be | y + on | |
node19-22.wn.iihe.ac.be | y + on | |
node19-23.wn.iihe.ac.be | y + on | |
node19-24.wn.iihe.ac.be | y + on | |
node19-25.wn.iihe.ac.be | y + on | |
node19-26.wn.iihe.ac.be | y + on | |
node19-27.wn.iihe.ac.be | y + on | |
node19-28.wn.iihe.ac.be | y + on | |
node19-29.wn.iihe.ac.be | y + on | |
node19-30.wn.iihe.ac.be | y + on | |
node19-31.wn.iihe.ac.be | y + on | |
node19-32.wn.iihe.ac.be | y + on | |
node20-1.wn.iihe.ac.be | y + on | |
node20-2.wn.iihe.ac.be | y + on | |
node20-3.wn.iihe.ac.be | y + on | |
node20-4.wn.iihe.ac.be | y + on | |
node20-5.wn.iihe.ac.be | y + on | |
node20-6.wn.iihe.ac.be | y + on | |
node20-7.wn.iihe.ac.be | y + on | |
node20-8.wn.iihe.ac.be | y + on | |
node20-9.wn.iihe.ac.be | y + on | |
node20-10.wn.iihe.ac.be | y + on | |
node20-11.wn.iihe.ac.be | y + on | |
node20-12.wn.iihe.ac.be | y + on | |
node20-13.wn.iihe.ac.be | y + on | |
node20-14.wn.iihe.ac.be | y + on | |
node20-15.wn.iihe.ac.be | y + on | |
node20-16.wn.iihe.ac.be | y + on | |
node20-17.wn.iihe.ac.be | y + on | |
node20-18.wn.iihe.ac.be | y + on | |
node20-19.wn.iihe.ac.be | y + on | |
node20-20.wn.iihe.ac.be | y + on | |
node20-21.wn.iihe.ac.be | y + on | |
node20-22.wn.iihe.ac.be | y + on | |
node20-23.wn.iihe.ac.be | needs to be included in maui | |
node20-24.wn.iihe.ac.be | needs to be included in maui | |
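If the table above is saved to a text file, the nodes with a given status can be pulled out with awk. This is a minimal sketch: the function name `nodes_with_status` is hypothetical, and it assumes the "workernode | status | done |" layout shown above.

```shell
# Hypothetical helper: list the nodes in the status table that have a given
# status. Assumes pipe-separated columns with the hostname in column 1 and
# the status in column 2; surrounding whitespace is trimmed before comparing.
nodes_with_status() {
    want="$1"
    table_file="$2"
    awk -F'|' -v want="$want" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        NF >= 2 && trim($2) == want { print trim($1) }
    ' "$table_file"
}
```

For instance, `nodes_with_status reserved table.txt > wn_list` would produce a hostname list (here `table.txt` is just a placeholder name) suitable for the loops and scripts described earlier.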