Docker Container
Chiltepin provides a Docker container environment for building and running Chiltepin applications. The container uses Docker Compose to build a multi-node Slurm cluster that serves as a backend for running workflow applications.
This is particularly useful for:
Development and testing on macOS or Windows
Creating a consistent testing environment
Demonstrating Chiltepin capabilities without access to HPC systems
Prerequisites
You need Docker and Docker Compose installed on your system.
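A quick way to confirm both tools are available on the host is sketched below. The check_tool helper is a hypothetical name introduced for illustration, and the sketch assumes Docker Compose is installed as the docker compose subcommand, which is the form used in the rest of this guide.

```shell
# Print whether a command is on PATH; return non-zero if it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

if check_tool docker; then
    docker --version
    # Compose v2 ships as a docker subcommand, matching the
    # "docker compose" invocations used in this guide.
    if docker compose version >/dev/null 2>&1; then
        docker compose version
    else
        echo "missing: docker compose plugin"
    fi
fi
```

If either tool is reported missing, install it before continuing to the build step.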
Building the Container
To build the Docker container environment:
$ cd docker
$ docker compose -f docker-compose.yml up -d
This will build and start a multi-node Slurm cluster. The process may take several minutes the first time as it downloads and builds the necessary images.
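To check that the services came up, something like the following may help. It is a sketch that assumes it is run from the docker directory used above; the service names and log contents depend on the compose file and are not shown here.

```shell
# Assumes the current directory is docker/, as in the build step.
compose_file="docker-compose.yml"

if command -v docker >/dev/null 2>&1 && [ -f "$compose_file" ]; then
    # List the services defined in the compose file and their current state.
    docker compose -f "$compose_file" ps
    # Show the last 20 log lines from each service.
    docker compose -f "$compose_file" logs --tail 20
else
    echo "docker or $compose_file not found; run this from the docker directory"
fi
```

A service that is missing from the ps output, or shown as exited, usually points to a problem worth investigating in its logs.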
Accessing the Container
Once the container is built and running, you can access it with a bash shell:
$ docker exec -it frontend bash -l
This will log you into the frontend node of the Slurm cluster.
Installing Chiltepin in the Container
Inside the container, the repository is mounted at ~/chiltepin. To install
Chiltepin in editable mode:
(container) $ cd ~/chiltepin
(container) $ pip install -e .[test]
Note
The [test] option ensures that dependencies required for running the tests
are installed.
Running Tests in the Container
After installation, you can run the test suite using the Docker-specific configuration:
(container) $ pytest --config=tests/configs/docker.yaml
For more verbose output:
(container) $ pytest -s -vvv --config=tests/configs/docker.yaml
Adjusting Core Count
Depending on how many cores your machine has and how many you’ve allocated to Docker,
you may need to modify the cores_per_node setting in tests/configs/docker.yaml
to match your machine’s specifications for all tests to pass.
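One way to compare the cores visible inside the container with the configured value is sketched below. It assumes it is run inside the container from the repository root and that nproc is available there. The layout of the YAML file is not reproduced here, so the sketch only locates the setting rather than editing it.

```shell
# Run inside the container, from the repository root (~/chiltepin).
config="tests/configs/docker.yaml"

# Number of processing units visible to this shell.
echo "cores available: $(nproc)"

if [ -f "$config" ]; then
    # Show each line that sets cores_per_node, with its line number.
    grep -n "cores_per_node" "$config"
else
    echo "config not found: $config"
fi
```

If a configured value exceeds the reported core count, lowering it in the file is the likely fix.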
Container Architecture
The Docker environment consists of:
Frontend node: Where you interact with the system and submit jobs
Compute nodes: Multiple Slurm compute nodes for running jobs
Master node: The Slurm controller that manages job scheduling and resource allocation
Shared volume: A shared directory for data and code accessible by all nodes
This simulates a real HPC cluster environment with job scheduling and multi-node execution.
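The architecture above can be exercised from the frontend node with standard Slurm commands. The following is a minimal sketch: slurm_smoke_test is a hypothetical helper name, and the srun line assumes at least two compute nodes are idle in the default partition.

```shell
# Run a few basic Slurm checks; return non-zero if Slurm is not on PATH.
slurm_smoke_test() {
    if ! command -v sinfo >/dev/null 2>&1; then
        echo "Slurm commands not found; run this on the frontend node"
        return 1
    fi
    sinfo                 # partitions and the state of each compute node
    srun -N 2 hostname    # run hostname on two nodes to confirm multi-node execution
    squeue                # jobs currently queued or running
}

slurm_smoke_test || echo "smoke test did not complete"
```

If srun prints two different hostnames, jobs submitted from the frontend node are reaching separate compute nodes.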
Stopping the Container
To stop the container environment:
$ cd docker
$ docker compose -f docker-compose.yml down
To stop the environment and also remove its volumes, deleting all data stored in them:
$ cd docker
$ docker compose -f docker-compose.yml down -v
Troubleshooting
- Container won’t start:
Check that Docker has sufficient resources allocated (CPU, memory, disk space).
- Tests fail with resource errors:
Reduce the cores_per_node values in the Docker configuration file.
- Cannot access mounted repository:
Ensure Docker has permission to access the repository directory on your host system. If not, clone the repository directly inside the container and install Chiltepin there.