This tutorial shows how to pass multiple parameters to Slurm. The Python script, realVol.py, calculates the realized volatility for 5 different price series, with each price series saved in a file named "series<#>.txt".
# --------------------------------
# realVol.py
# Calculate Realized Volatility
# --------------------------------
import sys
import math

def main():
    sumsq = 0
    n = 0
    # read first price in file
    price_previous = sys.stdin.readline()
    # read each subsequent price in file
    for price_current in sys.stdin:
        # calculate the daily log return
        daily_return = math.log(float(price_current) / float(price_previous))
        # accumulate the sum of squared returns
        sumsq = sumsq + daily_return**2
        price_previous = price_current
        n = n + 1
    # compute and output annualized realized volatility (252 trading days/year)
    real_vol = 100 * math.sqrt((252.0 / n) * sumsq)
    print("realized volatility = %.2f" % real_vol)

if __name__ == '__main__':
    main()
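Before submitting it to Slurm, realVol.py can be sanity-checked by hand. The sketch below applies the same annualization formula, 100 * sqrt((252/n) * sum of squared log returns), to a tiny made-up three-price series. The prices are purely illustrative and do not come from any series<#>.txt file:

```python
import math

# Hypothetical three-price series, for illustration only
prices = [100.0, 101.0, 100.0]

# Daily log returns between consecutive prices
returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]

# Annualized realized volatility, matching the formula in realVol.py
# (252 trading days per year; n = number of returns)
n = len(returns)
real_vol = 100 * math.sqrt((252.0 / n) * sum(r**2 for r in returns))
print("realized volatility = %.2f" % real_vol)
```

Two offsetting 1% moves over two days produce an annualized volatility of roughly 15.8, which is a quick way to confirm the formula behaves sensibly.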
#!/bin/bash
#-------------------------------------------------------------------------
# realVol.s
# Slurm job script
#
# This script runs a python program to calculate realized volatility
# on 5 different asset price series. Each series is in a separate
# flat file. One of:
#
# series1.txt
# series2.txt
# series3.txt
# series4.txt
# series5.txt
#
# Run the python script realVol.py once for each file in the filespec:
#
# series*.txt.
#
# Specify which series*.txt file to process by using Slurm's
# array task ID in the Python invocation with the specification:
#
# series${SLURM_ARRAY_TASK_ID}.txt
#
# The environment variable SLURM_ARRAY_TASK_ID gets set by the Slurm
# scheduler just before running the job. In this case, the environment
# variable SLURM_ARRAY_TASK_ID will be set to one of: 1, 2, 3, 4, or 5.
#-------------------------------------------------------------------------
#SBATCH --job-name=realVolJob # Job Name
#SBATCH --mem=512m # Request 512M RAM
#SBATCH --time=00:10:00 # Wall-clock time limit dd-hh:mm:ss
#SBATCH --mail-type=END,FAIL # Send email when job ends or fails
#SBATCH --mail-user=netID@stern.nyu.edu # Replace with your email for notifications
#SBATCH --output=realVol.%A-%a.out # Write output; %A=array master job ID; %a=array task ID
#SBATCH --array=1-5 # Specify array task ID
# **** Note: It is advantageous to request only as much run time and
# memory as the job actually needs. Jobs requesting fewer resources are
# more likely to be scheduled before jobs requesting more resources.
# Use environment modules to specify software and version.
# At the command line, run "module avail" to get a list of available software.
module purge
module load python/3.9.7
# Run the Python script on this array task's data file
cd "$SLURM_SUBMIT_DIR"
infile="series${SLURM_ARRAY_TASK_ID}.txt"
echo "Array Task ID: $SLURM_ARRAY_TASK_ID"
echo "Using data file: $infile"
# Run program
python realVol.py < "$infile"
The --array directive in Slurm allows a single job script to be run as multiple tasks.
In this example, we use it to pass a different data file to each task.
When a job is submitted with --array, Slurm creates one task per value in the specified range, and each task is identified by its array task ID, exposed to the job as the environment variable SLURM_ARRAY_TASK_ID.
The tasks may run concurrently, as cluster resources allow.
This feature is particularly useful when running a large number of similar tasks, such as parameter sweeps or simulations with different initial conditions.
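The shell expansion series${SLURM_ARRAY_TASK_ID}.txt in the job script resolves to a different file in each task. The same lookup can be sketched in Python; the fallback default of "1" is only an assumption here, so the snippet can be tried outside of a Slurm job:

```python
import os

# SLURM_ARRAY_TASK_ID is set by Slurm for each array task.
# The default of "1" is only for testing this snippet outside a Slurm job.
task_id = os.environ.get("SLURM_ARRAY_TASK_ID", "1")
infile = "series%s.txt" % task_id
print("task %s reads %s" % (task_id, infile))
```

With --array=1-5, five copies of this lookup would resolve to series1.txt through series5.txt, one per task.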
Run the Slurm script by typing the following command:
sbatch realVol.s
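The tutorial assumes series1.txt through series5.txt already exist. If you want to try the workflow without real market data, a sketch like the following can generate hypothetical stand-in files; the random-walk parameters (starting price, 1% daily moves, 253 prices per file) are arbitrary choices, not part of the tutorial's data:

```python
import random

random.seed(0)

# Generate five hypothetical price series, one file per series.
# Each file holds 253 prices (about one trading year), one per line.
for i in range(1, 6):
    price = 100.0
    with open("series%d.txt" % i, "w") as f:
        for _ in range(253):
            f.write("%.4f\n" % price)
            # random daily move with ~1% standard deviation
            price *= 1.0 + random.gauss(0, 0.01)
```

After generating the files, `python realVol.py < series1.txt` runs the calculation locally, which is a useful check before submitting the array job.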