Introduction
Run getSlurmExamples at the command line of rnd or vleda to have all the tutorials copied to your home directory.
This tutorial is intended to familiarize you with the process of scheduling and running jobs with the Slurm workload manager. You will use Slurm to run a basic Python script that prints Hello World to the console.
Make sure you’re logged in to the SSH terminal; instructions can be found here.
Writing Python code
Using a text editor such as vim, nano, or emacs, create a Python script. Below is an example called ‘hello-world.py’:
print("Hello World!")
Creating the Slurm Script
To run our Python script, we need to submit a batch job to Slurm so that it allocates the necessary compute resources. Slurm takes a batch script that describes the job; in this tutorial it is saved with a ‘.sbatch’ extension. The file should contain both the directives for Slurm to interpret and the commands for it to execute. Below is a simple example of a batch job that runs the Python script we just created. The file is named ‘hello-python.sbatch’:
#!/bin/bash
#SBATCH --job-name=hello # Set the name of the job
#SBATCH --output=hello.out # Specify the name of the output file
#SBATCH --export=ALL # Export all environment variables
#SBATCH --time=00:01:00 # Set the maximum runtime of the job to 1 minute
#SBATCH --mem=4G # Request 4 gigabytes of memory for the job
#SBATCH --mail-type=BEGIN,END,FAIL # Send email notifications
#SBATCH --partition=test # Specify the partition (queue) to submit the job to
module purge # Start with a clean environment
module load python/3.6.5 # Load the "python/3.6.5" environment module
python hello-world.py # Run the script using the loaded Python interpreter
The lines beginning with #SBATCH are known as Slurm directives. Slurm directives are special comment lines in a batch script that specify job options, resource requests, and scheduling parameters for a job submitted to a Slurm cluster.
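To make the idea of a directive concrete, the following is a minimal sketch in Python that collects the options found on #SBATCH lines. The function name parse_sbatch_directives is hypothetical, and the parsing is deliberately simplified: it assumes every option is written as --name=value and that no value contains a # character. It is an illustration only, not how sbatch itself is implemented.

```python
import shlex


def parse_sbatch_directives(script_text):
    """Collect the --name=value options given on #SBATCH lines."""
    options = {}
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            # sbatch stops reading directives at the first command line.
            break
        if not stripped.startswith("#SBATCH"):
            continue
        # Drop the "#SBATCH" prefix and any trailing "# comment".
        body = stripped[len("#SBATCH"):].split("#", 1)[0]
        for token in shlex.split(body):
            if token.startswith("--"):
                name, _, value = token[2:].partition("=")
                options[name] = value
    return options


# A shortened version of the batch script from this tutorial.
example = """#!/bin/bash
#SBATCH --job-name=hello # Set the name of the job
#SBATCH --time=00:01:00 # Set the maximum runtime of the job to 1 minute
#SBATCH --mem=4G # Request 4 gigabytes of memory for the job
python hello-world.py
"""

print(parse_sbatch_directives(example))
```

The shebang line and the final python command are ignored, because neither begins with #SBATCH.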
The module command is used to manage environment modules, which allow users to easily load and unload software and environment settings on systems such as HPC clusters. The purge subcommand clears all loaded modules, while the load subcommand loads a specific module.
The python hello-world.py command executes the Python script named “hello-world.py” using the Python interpreter. The script is run with the version of Python that was loaded via the environment module.
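One way to confirm which interpreter the job actually used is to have the script report it. The sketch below is a small variation of the hello-world script; the helper name interpreter_summary is hypothetical and not part of the tutorial files. It sticks to syntax that also works on Python 3.6.

```python
import platform
import sys


def interpreter_summary():
    """Describe the Python interpreter that is running this script."""
    return "Python {} at {}".format(platform.python_version(), sys.executable)


print("Hello World!")
print(interpreter_summary())
```

When run through the batch script, both lines end up in the output file named by the --output directive. The version shown should match the module that was loaded.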
Executing the Slurm Script
You then submit the job with the following command in your terminal:
sbatch hello-python.sbatch
The command submits the job to the cluster’s queue, where it waits for the necessary resources to become available. The time spent waiting in the queue depends on how busy the cluster is and on the resources your job requests. To check the status of your submitted jobs, use the following command:
squeue -u $USER
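The submit-then-check steps can also be scripted. The following is a rough sketch, assuming that sbatch and squeue are on your PATH and that sbatch prints its usual "Submitted batch job <id>" confirmation. The function names parse_job_id, submit, and queue_listing are hypothetical helpers introduced for illustration.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's confirmation message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError("unexpected sbatch output: {!r}".format(sbatch_output))
    return int(match.group(1))


def submit(script_path):
    """Submit a batch script with sbatch and return the new job ID."""
    result = subprocess.run(
        ["sbatch", script_path],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
        check=True,               # raise if sbatch exits with an error
    )
    return parse_job_id(result.stdout)


def queue_listing(job_id):
    """Return whatever squeue prints for a single job ID."""
    result = subprocess.run(
        ["squeue", "-j", str(job_id)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout


# Parsing can be tried without a cluster by using a sample message.
print(parse_job_id("Submitted batch job 12345"))
```

On a login node, submit("hello-python.sbatch") would return the job ID, which can then be passed to queue_listing. Once the job has finished, squeue may no longer list it.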