Policies: Difference between revisions
Revision as of 10:37, 7 October 2015
Policies concerning the usage of local computing resources
- The following rules are put in place to allow a fair share of resources between the users. In case you violate these rules your account could be disabled.
- In case you have specific needs concerning storage or CPU please contact the site administrators on T2bSupport.
User Interface policy
- All user interfaces are running Scientific Linux 6, except m3.iihe.ac.be which is still running SL5.
- The following machines are for light tasks, such as:
- CRAB submission
- small interactive ROOT processes
- building code
- debugging code
m0.iihe.ac.be, m1.iihe.ac.be, m2.iihe.ac.be, m3.iihe.ac.be
- The following machines are available for CPU-intensive and long tasks
m5.iihe.ac.be, m6.iihe.ac.be, m7.iihe.ac.be, m8.iihe.ac.be and m9.iihe.ac.be
- In the near future we will provide a simple system to distinguish the compilation machines from the ones where interactive jobs can run.
This policy is enforced. Processes that use more than 10 minutes of CPU time on m1 to m3 will be killed by the operating system. This limit is set to 5 hours on m5 to m9.
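As a rough illustration of the limits above, a long-running Python process can check how much CPU time it has already consumed. This is only a sketch: the constant and function names are hypothetical, and it assumes the limit counts user plus system CPU time of a single process, which the policy does not spell out.

```python
import resource

# Limits quoted in the policy above, in seconds of CPU time.
LIGHT_LIMIT_S = 10 * 60        # m1 to m3
HEAVY_LIMIT_S = 5 * 60 * 60    # m5 to m9


def cpu_seconds_used():
    """Return user + system CPU time consumed so far by this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime + usage.ru_stime


def remaining_cpu_seconds(limit_s, used_s=None):
    """Return the CPU seconds left before limit_s is reached (never negative)."""
    if used_s is None:
        used_s = cpu_seconds_used()
    return max(0.0, limit_s - used_s)


if __name__ == "__main__":
    print("CPU seconds used so far:", cpu_seconds_used())
    print("Left on a light-task machine:", remaining_cpu_seconds(LIGHT_LIMIT_S))
```

A job that sees the remaining time approach zero could write out its partial results and be moved to one of the machines meant for long tasks.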
Disk space usage policy
- Users can have several locations to store their files/analysis code/final results/...
- The /user partition on the UIs (m-machines) is limited to 500 GB per user.
- This space should be used as a working environment, e.g. to check out code, store results, ... It should not be used to store large datasets.
- The /localgrid partition on the UIs (also mounted on the worker nodes) shares its quota with /user, so /localgrid and /user together are limited to 500 GB.
- This space serves as sandbox for input/output of jobs sent to the local batch queue.
- The /pnfs area has a limit of 2TB per user.
- This area should contain the datasets, sometimes large, needed for physics analysis.
- If you need more space, please contact the site admins.
- Semi-automatic removal of old files on /pnfs is done every 3 months.
- All files not accessed in 1 year need to be explicitly un-flagged by the user in order to keep them.
- All other files CAN be marked by the user for deletion.
- Several mails will be sent to remind all users to do this.
- These mails will be sent over a span of ~1 month, after which the admins will proceed to delete all flagged files.
- More information is found on the deletion page: http://mon.iihe.ac.be/OldPnfsFiles
- If you need an account on this page, please ask the admins (grid_adminNOSPAM@listserv.vub.ac.be)
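To get a feeling for where you stand before the reminder mails arrive, a directory tree can be scanned for its total size and for files not accessed in a year. The sketch below is only an approximation under stated assumptions: the names scan, within_quota, QUOTA_BYTES and ONE_YEAR_S are hypothetical, 500 GB is taken as decimal gigabytes, and the access time seen by the local filesystem may differ from what the deletion page uses to flag files. The deletion page remains the reference.

```python
import os
import time

QUOTA_BYTES = 500 * 10**9          # 500 GB, assuming decimal gigabytes
ONE_YEAR_S = 365 * 24 * 60 * 60    # "not accessed in 1 year", in seconds


def scan(root, max_age_s=ONE_YEAR_S, now=None):
    """Walk root; return (total_bytes, sorted paths not accessed within max_age_s)."""
    now = time.time() if now is None else now
    total = 0
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += st.st_size
            if now - st.st_atime > max_age_s:
                stale.append(path)
    return total, sorted(stale)


def within_quota(*totals):
    """Return True if the summed byte totals fit in the shared quota."""
    return sum(totals) <= QUOTA_BYTES
```

For example, calling scan on your directories under /user and /localgrid and passing both totals to within_quota gives an estimate of the shared 500 GB usage. The sizes are apparent file sizes, so the number reported by the actual quota system may differ.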
Back-up procedures of files
- There is a local backup (snapshot mechanism) of the user home directories (for more detailed info, see Backup).
- This back-up is made every day and we can go back day by day up to one week. This is meant to address, e.g., small accidental deletions by users.
- Users are strongly advised not to rely solely on this backup. Using a version control system (SVN or CVS) should prevent accidental removal of files and allows a user to go back to a previous version when a file was messed up. We don't maintain a CVS repository ourselves, but the CMS one should be used, more info here
- The entire user home directories are backed up every week on physically separate hardware. This is to address catastrophe scenarios.
Memory usage on the grid
- To protect the grid, there is an upper memory limit per job of 2.0GB (this is larger than what is asked by CMS) for the physical and the virtual memory.
- If your job exceeds this limit, it will be killed by the queueing system.
- In your CRAB error log, you will find error code 271.
- If you use direct submission, the reason will be clearly stated in your error log.
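When testing a job locally before submission, its peak memory can be compared with the limit above. The following is a minimal sketch with hypothetical names, assuming 2.0 GB means 2 x 1024^3 bytes. It only measures peak resident (physical) memory of the current process, whereas the limit also applies to virtual memory, so treat the result as a lower bound.

```python
import resource
import sys

LIMIT_BYTES = 2 * 1024**3  # 2.0 GB per job, assuming binary gigabytes


def peak_rss_bytes():
    """Return the peak resident set size of this process in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    return peak if sys.platform == "darwin" else peak * 1024


def fraction_of_limit(used_bytes, limit_bytes=LIMIT_BYTES):
    """Return used_bytes as a fraction of the per-job memory limit."""
    return used_bytes / limit_bytes


if __name__ == "__main__":
    used = peak_rss_bytes()
    print("Peak memory: %.1f%% of the per-job limit" % (100 * fraction_of_limit(used)))
```

Calling this at the end of a local test run gives an early warning for jobs that would come close to the limit on the grid.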