HTCUpgrade: Difference between revisions

From T2B Wiki
Latest revision as of 09:32, 7 July 2022

Overview

We plan to upgrade HTCondor from version 8.9 to 9.0.
The main benefit of the new version is compatibility with tokens, which replace certificates for GRID authentication.
As it is a major change, it will require a complete reinstallation of all services.
However, this WILL NOT impact your jobs, as compute resources are migrated only once they have drained and are empty. You will be able to access jobs on both cluster versions.

Migration Schedule (Finished)

As of 04/07, 70% of job slots had already been migrated.
Most GRID jobs are already using the new version.


For a more transparent experience during this upgrade, please always use mshort.iihe.ac.be or mlong.iihe.ac.be to connect to the cluster.


  • Monday 04/07 12AM
    • m1 will be excluded from mshort V
    • m4/m5 will be excluded from mlong V
  • Tuesday 05/07 8PM
    • m1/m4/m5 will not be accessible anymore and will be reinstalled V
  • Wednesday 06/07 12AM
    • m2/m3 will be excluded from mshort V
    • m6/m7 will be excluded from mlong V
    • mshort will lead to upgraded HTC on m1 V
    • mlong will lead to upgraded HTC on m4/m5 V
  • Wednesday 06/07 8PM
    • m2/m3/m6/m7 will not be accessible anymore and will be reinstalled V
  • Thursday 07/07 12AM
    • mshort will lead again to m1/m2/m3 V
    • mlong will lead again to m4/m5/m6/m7 V

How to keep access to your jobs

As the two HTCondor versions will run in parallel, you will still be able to manage your jobs on the old HTC version from the upgraded mX machines (and vice versa).


  • Old HTC version:
condor_q <==> condor_q -name schedd01 <==> condor_q -pool testumd-htcmaster.wn.iihe.ac.be -name schedd01.wn.iihe.ac.be
  • New HTC version:
condor_q <==> condor_q -name schedd03 <==> condor_q -pool cm.wn.iihe.ac.be -name schedd03.wn.iihe.ac.be

So once the mX machines are upgraded, you will be able to access the old HTC version cluster with:

condor_q -pool testumd-htcmaster.wn.iihe.ac.be -name schedd01.wn.iihe.ac.be
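To check both queues in one go during the transition, the two queries can be wrapped in a small shell function. This is only a sketch: the function name `htc_q_all` and the `CONDOR_Q` override variable are hypothetical helpers and not part of HTCondor; the pool and schedd names are the ones given above.

```shell
# Hypothetical helper: list your jobs on both the old and the new cluster.
# CONDOR_Q can be overridden (e.g. CONDOR_Q=echo) to preview the commands
# instead of running them. Extra arguments are passed through to condor_q.
htc_q_all() {
    q="${CONDOR_Q:-condor_q}"
    echo "== old HTC version (schedd01) =="
    "$q" -pool testumd-htcmaster.wn.iihe.ac.be -name schedd01.wn.iihe.ac.be "$@"
    echo "== new HTC version (schedd03) =="
    "$q" -pool cm.wn.iihe.ac.be -name schedd03.wn.iihe.ac.be "$@"
}
```

After pasting the function into your shell, `htc_q_all` runs both queries one after the other.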

Conversely, you can start submitting directly to the new HTC version cluster from the yet-to-be-upgraded mX machines (i.e. you can do it right now) with:

condor_submit -pool cm.wn.iihe.ac.be -name schedd03.wn.iihe.ac.be MYFILE.SUB
condor_q -pool cm.wn.iihe.ac.be -name schedd03.wn.iihe.ac.be
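To avoid retyping the long `-pool`/`-name` options, they can be kept in one place. The sketch below assumes a hypothetical helper named `htc_args` (it is not an HTCondor command) that simply prints the options documented on this page for the "old" or "new" cluster.

```shell
# Hypothetical helper: print the -pool/-name options for a cluster.
# Accepts "old" (HTCondor 8.9, schedd01) or "new" (HTCondor 9.0, schedd03).
htc_args() {
    case "$1" in
        old) echo "-pool testumd-htcmaster.wn.iihe.ac.be -name schedd01.wn.iihe.ac.be" ;;
        new) echo "-pool cm.wn.iihe.ac.be -name schedd03.wn.iihe.ac.be" ;;
        *)   echo "usage: htc_args old|new" >&2; return 1 ;;
    esac
}

# Usage (the substitution is left unquoted on purpose so that it
# splits into separate options):
#   condor_submit $(htc_args new) MYFILE.SUB
#   condor_q $(htc_args old)
```

Keeping the host names in a single function means only one place needs editing if the pool or schedd names change after the migration.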