Run getSlurmExamples at the command line of rnd or vleda to have all the tutorials copied to your home directory.
This tutorial will teach you how to pass multiple parameters to Slurm. The Python script, realVol.py, calculates the realized volatility for 5 different price series, with each price series saved in a file named "series<#>.txt".
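Concretely, if a file contains daily prices p_0, p_1, ..., p_n, the script forms the daily log returns, annualizes their sum of squares (assuming 252 trading days per year), and reports the result in percent:

$$ \text{realized volatility} = 100\,\sqrt{\frac{252}{n}\sum_{i=1}^{n} r_i^2}, \qquad r_i = \ln\!\left(\frac{p_i}{p_{i-1}}\right) $$

This matches the computation in the code below.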
#------------------------------------
# realVol.py
# Calculate Realized Volatility
#------------------------------------
import math
import sys

def main():
    sumsq = 0
    n = 0
    # read the first price in the file
    price_previous = sys.stdin.readline()
    # read each subsequent price in the file
    for price_current in sys.stdin:
        # calculate the daily log return
        daily_return = math.log(float(price_current)/float(price_previous))
        # accumulate the sum of squared returns
        sumsq = sumsq + daily_return**2
        price_previous = price_current
        n = n + 1
    # compute and output the annualized realized volatility
    real_vol = 100*math.sqrt((252.0/n)*sumsq)
    print("realized volatility = %.2f" % real_vol)

if __name__ == '__main__':
    main()
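Before submitting anything to Slurm, it helps to sanity-check the script locally. If you don't yet have price files on hand, here is a minimal sketch that generates a synthetic one; the file name and random-walk parameters are illustrative only, not part of the tutorial:
# make_series.py -- write a synthetic price series for testing (hypothetical helper)
import random

price = 100.0
with open("series1.txt", "w") as f:
    for _ in range(253):                        # 253 prices -> 252 daily returns
        f.write("%.2f\n" % price)
        price *= 1.0 + random.gauss(0, 0.01)    # roughly 1% daily moves
You can then run the calculation on it directly:
python realVol.py < series1.txt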
The Slurm script, realVol.sbatch:
#!/bin/bash
#------------------------------------
# realVol.sbatch
# Slurm job script
#------------------------------------
#SBATCH --job-name=realVol # set the name of the job
#SBATCH --array=1-5 # create an array job with task IDs ranging from 1 to 5
#SBATCH --export=ALL # export env variables to the compute nodes
#SBATCH --mem=512m # set the amount of memory required for each task to 512 megabytes
#SBATCH --mail-type=BEGIN,END,FAIL # send email notifications when the job starts, ends or fails
#SBATCH --output=realVol.%A-%a.out # set the output file name for each task. The %A is replaced by the job ID and %a is replaced by the task ID.
#SBATCH --partition=test # specify the partition to run the job in. In this case, it is "test".
#SBATCH --time=00:10:00 # set the maximum time limit for each task to 10 minutes. If a task exceeds this time limit, it will be terminated.
The --array
directive in Slurm allows a single job script to be executed multiple times, each time with a different input parameter. When a job is submitted with --array
, Slurm automatically creates an array of tasks, with each task corresponding to one value in the specified range. The tasks can run in parallel on different nodes, and each task is identified by the task ID that Slurm automatically assigns to it.
This feature is particularly useful when running a large number of similar tasks, such as parameter sweeps or simulations with different initial conditions. For our use case, we use it to select the input file for each task dynamically.
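For reference, --array is not limited to a simple range: it also accepts comma-separated lists, a step size, and a % limit on how many tasks may run at once. These variants are standard Slurm syntax:
#SBATCH --array=1,3,5      # run only task IDs 1, 3, and 5
#SBATCH --array=1-9:2      # task IDs 1 through 9 in steps of 2
#SBATCH --array=1-100%10   # 100 tasks, at most 10 running at a time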
The body of the script uses ${SLURM_ARRAY_TASK_ID} to select this task's input file, printing the command before running it:
# pick the input file that matches this task's ID
input_file=$(ls series${SLURM_ARRAY_TASK_ID}*.txt)
echo "python realVol.py < ${input_file}"
python realVol.py < ${input_file}
You can submit the Slurm script by typing the following command:
sbatch realVol.sbatch
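While the array job runs, each task appears in the queue as <jobid>_<taskid>; you can watch them with:
squeue -u $USER
When all five tasks complete, the --output=realVol.%A-%a.out directive gives you one file per task, realVol.<jobid>-1.out through realVol.<jobid>-5.out, each containing the realized volatility for its series.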