In this tutorial, you will run an R script. This script generates simulated data points with random noise, fits a cubic smoothing spline to the data, and plots the data points, original model, and smoothing spline.
R Script
# fitspline.R
# Seed the random number generator with a randomly drawn value,
# so each run produces different noise. Use a fixed seed for reproducible results.
set.seed(sample(1:1000, 1))
# Create n = 100 random data points.
# x is n equally spaced values from 0 to 1.
n <- 100
x <- (1:n)/n
# The model in this simulation (no random error)
mval <- ((exp(x/3) - 2 * exp(-7 * x) + sin(9 * x)) + 1)/3
# Generate n independent normal random variates with mean 0
# and standard deviation derived from the task id (tid/100)
tid <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID"))
v <- tid/100
noise <- rnorm(n, 0, v)
# Simulated observed values (model value + noise)
y <- mval + noise
# Alternatively, you can read data from a file:
# dat <- read.table("dataset.dat", header = TRUE)
# attach(dat)
# Fit a cubic smoothing spline to the data
# Use GCV score and all basis functions
fit <- smooth.spline(x, y, cv = FALSE, all.knots = TRUE)
# Create a graph that shows the data, the smoothing spline,
# and the original model
r <- paste("result_", tid, ".ps", sep = "")
postscript(r, height = 8, width = 10, horizontal = FALSE)
# Plot data points
plot(x, y, xlab = "x", ylab = "y", cex = 0.5)
# Plot original model values without noise
lines(x, mval, lty = 2)
# Plot smoothing spline fit
lines(fit$x, fit$y)
# Close the graphics device, which writes the graph to the PS file
graphics.off()
# To view with Ghostscript, at the
# command line type: gs result_X.ps
# where X is the corresponding task id
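Because the script takes its noise level from the SLURM_ARRAY_TASK_ID environment variable, it can be tried outside Slurm by setting that variable by hand. The following is a minimal sketch, assuming Rscript is on your PATH and fitspline.R is in the current directory; the helper name run_local is hypothetical and not part of the tutorial.

```shell
# Hypothetical helper: run fitspline.R once, outside Slurm, for one task id.
run_local() {
    # The R script divides the id by 100 to get the noise standard
    # deviation, so accept only positive integers.
    case "$1" in
        ''|*[!0-9]*)
            echo "task id must be a positive integer: '$1'" >&2
            return 1
            ;;
    esac
    if [ "$1" -le 0 ]; then
        echo "task id must be greater than zero" >&2
        return 1
    fi
    # Assumption: Rscript is available (for example after 'module load R/4.3.2').
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found; load an R module first" >&2
        return 1
    fi
    # Set the variable for this one command only.
    SLURM_ARRAY_TASK_ID="$1" Rscript fitspline.R
}
```

For example, `run_local 10` should write result_10.ps using a noise standard deviation of 0.1.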
Slurm Script
#!/bin/bash
#
# [ fitspline.sbatch ]
#
# This script demonstrates how to run an R job, specifically,
# how to fit a smoothing spline model. It uses a range of
# SLURM_ARRAY_TASK_ID values as a noise parameter for the model.
# --------------------------------------------------------------------
#
#
# Use environment modules to specify the software and version. Run the
# "module avail" command to get a list of installed software versions,
# then replace <X.Y.Z> with the actual software version, as shown below.
#
# module purge
# module load R/<X.Y.Z>
#
# For example:
#
# rnd> module avail R
# ---------------- /gridapps/modules -----------------
# R/4.0.2 R/4.3.2
#
# rnd>
# rnd> module purge
# rnd> module load R/4.3.2
# rnd>
# --------------------------------------------------------------------
# Submit this job via 'sbatch' which accepts these command line options:
#SBATCH --job-name=fitsplineJob # Define the name of the current job.
#SBATCH --output=fitsplineJob.Rout # Define the stdout (the terminal output)
# file name.
#SBATCH --error=fitsplineJob.err # Define the stderr (the terminal error output)
# file name. If '--error=' is not specified,
# both stdout and stderr are written to the
# same output file (in this case,
# fitsplineJob.Rout).
#SBATCH --export=ALL # Export ALL environment variables of the
# submitting process to the submitted job
# including the current working directory.
#SBATCH --mail-type=END,FAIL # Tell Slurm's sbatch to send email when the
# job ENDs or when the job FAILs. Options
# that may be specified with '--mail-type=' are
# NONE or any combination of:
# BEGIN,END,FAIL,REQUEUE,STAGE_OUT. There
# is no default for '--mail-type='.
#SBATCH --mail-user=your-stern-netid@stern.nyu.edu
# Specify your Stern email address to notify.
#SBATCH --mem=512m # Specifies the maximum memory this program will
# be allocated. Here, 512m specifies 512
# Megabytes. Memory units may be: k|m|g|t
# for kilo/mega/giga/tera-bytes. The default
# units are Megabytes.
#SBATCH --time=00:10:00 # The wall-clock time limit for this job, here 10 min.
#SBATCH --partition=test # Specify to which partition to submit the job.
# A partition is a group of compute servers
# which may potentially run this job.
# The 'test' partition is the default partition.
# The maximum time you
# may request for the 'test' partition is
# 100 hours (--time=100:00:00).
# --------------------------------------------------------------------
# How to submit an array job to the Stern Slurm cluster.
# Pass the array task IDs 5, 10, and 15 to the R script.
#
# sbatch --array=5-15:5 fitspline.sbatch
# --------------------------------------------------------------------
# Select the statistics package and version to use
module purge
module load R/4.3.2
R CMD BATCH --no-save --no-restore fitspline.R fitspline.$SLURM_ARRAY_TASK_ID.Rout
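With an array job, all tasks write to the single file named by --output unless the name contains a per-task placeholder. Slurm's filename patterns %A (the array's master job ID) and %a (the array task ID) can be used for this. A sketch, assuming you want one stdout file and one stderr file per task; the file name stems and the job ID 12345 in the comments are only illustrative:

```shell
#SBATCH --output=fitsplineJob.%A_%a.out   # e.g. fitsplineJob.12345_5.out
#SBATCH --error=fitsplineJob.%A_%a.err    # e.g. fitsplineJob.12345_5.err
```

These two lines would take the place of the existing --output and --error directives in fitspline.sbatch.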
To submit the Slurm script as an array job with task IDs 5, 10, and 15, execute this command:
sbatch --array=5-15:5 fitspline.sbatch
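The specification --array=5-15:5 means first ID 5, last ID 15, step 5, so Slurm starts three tasks. The sketch below lists, for each task, the noise standard deviation the R script will use and the files it should leave behind, based on the names used in the two scripts above. The function name expected_outputs is hypothetical.

```shell
# Hypothetical helper: expand FIRST LAST STEP the way --array=FIRST-LAST:STEP
# does, and print what each task is expected to produce.
expected_outputs() {
    tid=$1
    while [ "$tid" -le "$2" ]; do
        # fitspline.R uses tid/100 as the standard deviation of the noise.
        sd=$(awk -v t="$tid" 'BEGIN { printf "%.2f", t / 100 }')
        echo "task ${tid}: sd=${sd} plot=result_${tid}.ps log=fitspline.${tid}.Rout"
        tid=$((tid + $3))
    done
}

expected_outputs 5 15 5
```

After the job finishes, `ls result_*.ps` should show the three plot files listed by the sketch.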