Using htmap
What is htmap?
It's a Python library that lets you map function calls out to an HTCondor cluster. It is easy to use and brings the computing power of our cluster into your Python code.
You'll find a more detailed presentation here.
How to use htmap with our HTCondor cluster?
There are many ways to use htmap, and we haven't tested all of them. This short tutorial demonstrates how to use htmap in a shared mode. ("Shared" means that it exploits the fact that your home directory is also available on the worker nodes.) Let's start!
1. Log in to an m-machine.
2. Set up a Python 3 virtual environment:
$ mkdir python-envs
$ cd python-envs
$ python3 -m venv htmap_env
$ source htmap_env/bin/activate
3. In this new environment, you can install the htmap library:
$ python -m pip install htmap
4. Now we'll write a simple script, test_htmap.py, to test htmap. Here is its content:
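#!/usr/bin/env python
import sys

import htmap
from htmap import names


def _get_base_descriptors_for_shared(tag: str, map_dir):
    # Submit descriptors for the 'shared' delivery method: the jobs run with
    # the current Python interpreter instead of transferring one.
    return {
        'universe': 'vanilla',
        'executable': sys.executable,
        'transfer_executable': 'False',
        'arguments': f'{names.RUN_SCRIPT} $(component)',
        'transfer_input_files': [
            (map_dir / names.RUN_SCRIPT).as_posix(),
        ],
    }


# Register the custom delivery method and make it the one htmap uses.
htmap.register_delivery_method(
    'shared',
    descriptors_func=_get_base_descriptors_for_shared,
)
htmap.settings["DELIVERY_METHOD"] = 'shared'

# Map str() over five inputs on the cluster and print the results.
m = htmap.map(str, range(5))
print(list(m))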
5. Now you can run this script (make it executable first with chmod +x test_htmap.py if needed) and wait for the result:
$ ./test_htmap.py
['0', '1', '2', '3', '4']
6. To get out of the Python virtual environment when you are done:
$ deactivate
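Before running the script, steps 2 and 3 can be double-checked from Python. The following is a minimal sketch, not part of htmap: the helper names in_virtualenv and is_installed are ours, and it only assumes a standard Python 3 installation.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def is_installed(module_name: str) -> bool:
    """Return True when a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("htmap importable:", is_installed("htmap"))
```

If either line reports False, activate the environment again (step 2) or reinstall htmap (step 3) before running the test script.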
Tips and tricks
From a second shell session with the same virtual environment activated, you can monitor the progress of your htmap script with the following command:
$ htmap status --live
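It can also help to try your function locally before sending it to the cluster, since htmap.map calls the function once per input, much like Python's built-in map. The sketch below is only a local stand-in that never contacts HTCondor; local_dry_run and double are hypothetical names used for illustration.

```python
def double(x):
    """Hypothetical example function standing in for your own code."""
    return 2 * x


def local_dry_run(func, inputs):
    """Call func once per input on this machine and collect the results."""
    return [func(item) for item in inputs]


if __name__ == "__main__":
    # Same function and inputs as the htmap.map(str, range(5)) call in the test script.
    print(local_dry_run(str, range(5)))
    print(local_dry_run(double, range(5)))
```

Once the function behaves as expected locally, the same function and inputs can be passed to htmap.map as in the test script.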