HelpPageForAllScripts
CE
- manage_queues.py
Usage: manage_queues.py [OPTION]
Manage Torque queues
    -h, --help            Show this help
    -s, --status          Show queues status
    --close-all           Close all opened queues
    --open-all            Open all closed queues
    --show-all            Show all attributes for all queues
    --close QUEUE_NAME    Close queue QUEUE_NAME
    --open QUEUE_NAME     Open queue QUEUE_NAME
    --show QUEUE_NAME     Show all attributes for queue QUEUE_NAME
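As a rough illustration of the interface above, the following sketch parses the same single-queue options with argparse and maps them to qmgr invocations. How manage_queues.py actually works is not documented here: the function names are hypothetical, and treating "close" as setting the Torque queue attribute `enabled` to False is an assumption (the real script may use `started` or both). The commands are only built, never executed.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="manage_queues.py", description="Manage Torque queues"
    )
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-s", "--status", action="store_true",
                       help="Show queues status")
    group.add_argument("--close-all", action="store_true",
                       help="Close all opened queues")
    group.add_argument("--open-all", action="store_true",
                       help="Open all closed queues")
    group.add_argument("--show-all", action="store_true",
                       help="Show all attributes for all queues")
    group.add_argument("--close", metavar="QUEUE_NAME",
                       help="Close queue QUEUE_NAME")
    group.add_argument("--open", metavar="QUEUE_NAME",
                       help="Open queue QUEUE_NAME")
    group.add_argument("--show", metavar="QUEUE_NAME",
                       help="Show all attributes for queue QUEUE_NAME")
    return parser


def qmgr_command(args):
    """Return the argv list for a single-queue option, or None otherwise.

    Assumption: "closed" means the queue no longer accepts new jobs,
    i.e. the Torque queue attribute `enabled` is False.
    """
    if args.close:
        return ["qmgr", "-c", f"set queue {args.close} enabled = False"]
    if args.open:
        return ["qmgr", "-c", f"set queue {args.open} enabled = True"]
    if args.show:
        return ["qmgr", "-c", f"list queue {args.show}"]
    return None
```

Passing the returned list to `subprocess.run` on the Torque server would apply the change; building the command separately keeps the mapping easy to check without touching a live queue.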
- myce.sh
Valid args :
    - list-running-nodes ( -r )
    - list-up-nodes ( -u )
    - list-down-nodes ( -d )
    - count-cpu (-c) [queue]
- Releasehold
Releases the holds on jobs. To release all of them, run: ./Releasehold Deferred
- Requeue_jobs
Needs 2 arguments to grep on. Run without arguments, it prints: No argument !! Need 2 arguments to grep on
- Restart-creamce
Restarts the CREAM CE.
- Restart_pbs_maui
Restarts Maui and PBS in the right sequence.
- parse_torque_accounting_log.py
Shows summary info for all jobs from the last 8 days.
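For orientation, a per-user summary of this kind can be sketched as below. This is not the code of parse_torque_accounting_log.py. It assumes the usual Torque accounting record layout (`timestamp;record type;job id;key=value ...`), counts only end-of-job records (type `E`), and assumes `resources_used.walltime` is written as HH:MM:SS. All function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def parse_record(line):
    """Split one accounting line into timestamp, type, job id and fields."""
    timestamp, rtype, job_id, message = line.rstrip("\n").split(";", 3)
    fields = {}
    for token in message.split():
        key, sep, value = token.partition("=")
        if sep:  # keep only key=value tokens
            fields[key] = value
    return {
        "time": datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S"),
        "type": rtype,
        "job_id": job_id,
        "fields": fields,
    }


def walltime_seconds(text):
    """Convert HH:MM:SS to seconds (assumed walltime format)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarize(lines):
    """Count finished jobs and total walltime per user."""
    summary = defaultdict(lambda: {"jobs": 0, "walltime": 0})
    for line in lines:
        if line.count(";") < 3:
            continue  # not a well-formed record
        record = parse_record(line)
        if record["type"] != "E":
            continue  # only end-of-job records carry resources_used.*
        user = record["fields"].get("user", "unknown")
        walltime = record["fields"].get("resources_used.walltime", "0:0:0")
        summary[user]["jobs"] += 1
        summary[user]["walltime"] += walltime_seconds(walltime)
    return dict(summary)
```

Calling `summarize(open(path))` on one daily accounting file returns a dictionary keyed by user; restricting the result to the last 8 days would mean filtering on each record's `time` field or reading only the matching daily files.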
- torque-user-info.py
Uses the same infrastructure as parse_torque_accounting_log.py and prints the details of all jobs per user. More details here.
QNAT
- dns-manager
dns-manager ADD|DELETE $HOST $IP
example: ./dns-manager ADD schtroumpf.wn.iihe.ac.be 192.168.66.6
- get-free-ip
Usage: get-free-ip 192.168[.10][.65] [-d]
    IP : The ip provided must be in one of the ranges owned by IIHE.
         You can omit AT MOST the 2 last numbers of an ip.
         Giving a complete ip will return the first free one starting from the provided ip.
         It is useful to get IPs from subnets not /16 or /24, eg:
             get-free-ip 193.58.172.128 will give free IPs in 193.58.172.128/25
    -d : Enables debugging. Must be put after IP.
All messages go to stderr, only the final ip is printed to stdout.
[Only errors that exit the program print their messages to stdout too.]
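The lookup rules described above can be sketched as follows. This is only an illustration of the documented behaviour, not the get-free-ip source: how the real script decides that an address is free is not stated, so the sketch takes the set of used addresses and the list of owned networks as parameters (both hypothetical), and the function name is invented.

```python
import ipaddress


def first_free_ip(partial, used, networks):
    """Return the first free address for a 2-, 3- or 4-number IP.

    partial  : "192.168", "192.168.10" or a complete IP such as "193.58.172.128"
    used     : set of ipaddress.IPv4Address already taken (assumed input)
    networks : list of ipaddress.IPv4Network owned by the site (assumed input)
    """
    parts = partial.split(".")
    if not 2 <= len(parts) <= 4:
        raise ValueError("at most the 2 last numbers can be omitted")
    start = ipaddress.ip_address(".".join(parts + ["0"] * (4 - len(parts))))

    if len(parts) == 4:
        # Complete IP: search from it to the end of the owned subnet.
        net = next((n for n in networks if start in n), None)
        if net is None:
            raise ValueError("IP is not in an owned range")
    else:
        # Partial IP: 2 numbers -> /16, 3 numbers -> /24.
        net = ipaddress.ip_network(f"{start}/{8 * len(parts)}")
        if not any(net.overlaps(n) for n in networks):
            raise ValueError("IP is not in an owned range")

    candidate = start
    while candidate < net.broadcast_address:
        # Skip the network address itself and anything already in use.
        if candidate != net.network_address and candidate not in used:
            return candidate
        candidate += 1
    return None  # nothing free in the range
```

With `193.58.172.128/25` among the owned networks, a complete IP such as `193.58.172.128` is searched only up to the end of that /25, which matches the note about subnets that are neither /16 nor /24.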
- reboot-wns
Usage: reboot-wns --pre-reboot
       reboot-wns --post-reboot
       reboot-wns --server [--force-reboot] [--offline-slots <NSLOTS>]
--server: start on qnat the server mode to reboot all WNs automatically
          Logfile stored in /tmp/reboot-wns.log
    OPTIONS:
    [--force-reboot] : will use special reboot forcing it, rather than reboot function. Useful when nfs is hanging
    [--offline-slots <NSLOTS>] : specify the amount of slots you want to have offline. Defaults to 700
    [--continue] : will lookup the status of the last time reboot-cluster --server was run and will continue from there.
                   This allows to specify what still needs to reboot by editing the files:
                       /tmp/listOfNodesStillToReboot
                       /tmp/listOfNodesPutBackInProd
--pre-reboot: executed on WN to check if reboot can be performed.
              Exit code of 1 if the test sequence fails
--post-reboot: executed on WN to check if node can go back into prod
               Exit code of 1 if the test sequence fails
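To make the role of --offline-slots concrete, here is a minimal sketch of how a rolling reboot could choose the next nodes to take offline without exceeding the slot limit. The actual selection logic of reboot-wns is not documented here; the greedy strategy, the function name and the per-node slot table are all assumptions. Only the default of 700 comes from the help text.

```python
def pick_next_batch(pending, slots, offline_now=0, max_offline=700):
    """Pick nodes to take offline while staying within the slot budget.

    pending     : ordered list of node names still to reboot
    slots       : dict mapping node name -> number of job slots (assumed input)
    offline_now : slots already offline from nodes currently rebooting
    max_offline : upper bound on offline slots (--offline-slots, default 700)
    """
    budget = max_offline - offline_now
    batch = []
    for node in pending:
        needed = slots[node]
        if needed <= budget:
            # Node fits in the remaining budget: schedule it.
            batch.append(node)
            budget -= needed
    return batch
```

Nodes that do not fit are simply left in the pending list for a later round, once rebooted nodes have passed --post-reboot and gone back into production.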