PDC Configuration
nf-core pipelines have been successfully configured for use on the PDC cluster dardel. No other clusters have yet been tested, but support can be added if needed.
Getting started
The base Java installation on dardel is Java 11. Loading the PDC and Java modules makes other versions (e.g. 17) available.
To pull new singularity images, singularity must be available (e.g. through the module system) to the nextflow monitoring process. Suggested preparatory work before launching nextflow is:
module load PDC Java singularity
(For reproducibility, it may be a good idea to check which versions you have loaded by running module list and to load those exact versions afterwards, e.g. module load PDC/22.06 singularity/3.10.4-cpeGNU-22.06 Java/17.0.4.)
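Before launching, it can help to confirm that the loaded modules actually put the tools on your PATH. The following is a minimal sketch, not part of PDC's tooling: require_cmd and check_prereqs are hypothetical helper names introduced here for illustration.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing: $1 (did you run 'module load PDC Java singularity'?)" >&2
    return 1
}

# Check every tool given as an argument; return non-zero if any is missing.
check_prereqs() {
    status=0
    for tool in "$@"; do
        require_cmd "$tool" || status=1
    done
    return "$status"
}
```

With these defined, running check_prereqs java singularity reports each missing tool on stderr and returns non-zero if any is absent.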
No singularity images or nextflow versions are currently preloaded on dardel. To get started, you can for example download nextflow with:
wget https://raw.githubusercontent.com/nextflow-io/nextflow/master/nextflow && \
chmod a+x nextflow
The profile pdc_kth has been provided for convenience. It expects you to pass the project used for slurm accounting through --project, e.g. --project=nais2023-22-1027.
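Putting the profile and the project option together, a launch command might be assembled as in the sketch below. This is only an illustration: build_nextflow_cmd is a hypothetical helper, the pipeline name is whatever nf-core pipeline you intend to run, and it assumes the nextflow launcher was downloaded to the current directory as shown earlier.

```shell
# Hypothetical helper: print the nextflow command line for a pipeline and project.
# It only prints the command, so it can be inspected before being run.
build_nextflow_cmd() {
    pipeline="$1"
    project="$2"
    if [ -z "$pipeline" ] || [ -z "$project" ]; then
        echo "usage: build_nextflow_cmd <pipeline> <project>" >&2
        return 1
    fi
    # -profile selects the pdc_kth profile; --project is passed on for slurm accounting
    printf './nextflow run %s -profile pdc_kth --project=%s\n' "$pipeline" "$project"
}
```

For example, build_nextflow_cmd nf-core/rnaseq nais2023-22-1027 prints a command line that can be reviewed and then executed. Pipeline-specific parameters still need to be appended.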
Due to how partitions are set up on dardel, in particular the lack of long-runtime nodes with more memory, some runs may be difficult to get through.
Note that node-local scratch is not available, and SNIC_TMP as well as PDC_TMP point to a cluster-scratch area that will have similar performance characteristics to your project storage. /tmp points to a local tmpfs, which uses RAM to store contents. Given that nodes don't have swap space, anything stored in /tmp will mean less memory is available for your job.
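One way to keep large temporary files out of the RAM-backed /tmp is to pick a scratch directory from the variables mentioned above. The sketch below assumes PDC_TMP and SNIC_TMP are set by the site environment when available; pick_scratch is a hypothetical helper name, and the order of preference is an assumption rather than a PDC recommendation.

```shell
# Hypothetical helper: print a scratch directory that is not the RAM-backed /tmp.
# Prefers PDC_TMP, then SNIC_TMP, then the fallback passed as $1 (default: current dir).
pick_scratch() {
    if [ -n "${PDC_TMP:-}" ]; then
        echo "$PDC_TMP"
    elif [ -n "${SNIC_TMP:-}" ]; then
        echo "$SNIC_TMP"
    else
        echo "${1:-$PWD}"
    fi
}
```

The result could then be exported, for instance as TMPDIR, before starting a job. Whether a given tool honours TMPDIR depends on that tool.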
Config file
// Nextflow config for use with PDC at KTH
// Detect the cluster name through Slurm; keep "unknown" if sacctmgr cannot be run
def cluster = "unknown"
try {
    cluster = ['/bin/bash', '-c', 'sacctmgr show cluster -n | grep -o "^\s*[^ ]*\s*"'].execute().text.trim()
} catch (java.io.IOException e) {
    System.err.println("WARNING: Could not run sacctmgr, defaulting to unknown")
}
params {
    project = null // Naiss project allocation
    config_profile_description = 'PDC profile.'
    config_profile_contact = 'Pontus Freyhult (@pontus)'
    config_profile_url = "https://www.pdc.kth.se/"
    max_memory = 1790.GB
    max_cpus = 256
    max_time = 7.d
    schema_ignore_params = "genomes,input_paths,cluster-options,clusterOptions,project,validationSchemaIgnoreParams"
    validationSchemaIgnoreParams = "genomes,input_paths,cluster-options,clusterOptions,project,schema_ignore_params"
}
def containerOptionsCreator = {
    switch(cluster) {
        case "dardel":
            return '-B /cfs/klemming/'
    }
    return ''
}
def clusterOptionsCreator = { mem, time, cpus ->
    String base = "-A $params.project ${params.clusterOptions ?: ''}"
    switch(cluster) {
        case "dardel":
            String extra = ''
            if (time <= 7.d && mem <= 111.GB && cpus <= 256) {
                extra += ' -p shared '
            }
            else if (time < 1.d) {
                // Shortish
                if (mem > 222.GB) {
                    extra += ' -p memory,main '
                } else {
                    extra += ' -p main '
                }
            } else {
                // Not shortish
                if (mem > 222.GB) {
                    extra += ' -p memory '
                } else {
                    extra += ' -p long '
                }
            }
            if (!mem || mem < 6.GB) {
                // Impose minimum memory if request is below
                extra += ' --mem=6G '
            }
            return base+extra
    }
    return base
}
singularity {
    enabled = true
    runOptions = containerOptionsCreator()
}
process {
    resourceLimits = [
        memory: 1790.GB,
        cpus: 256,
        time: 7.d
    ]
    // Should we lock these to specific versions?
    beforeScript = 'module load PDC apptainer'
    executor = 'slurm'
    clusterOptions = { clusterOptionsCreator(task.memory, task.time, task.cpus) }
}
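To see which partition a task would be sent to on dardel, the decision made in clusterOptionsCreator can be traced by hand. The shell sketch below mirrors that decision table under simplifying assumptions: inputs are whole hours, whole gigabytes and a CPU count, with 7 days written as 168 hours and 1 day as 24 hours. pick_partition is a hypothetical name used only for illustration and is not part of the config.

```shell
# Hypothetical helper mirroring the dardel branch of clusterOptionsCreator.
# Arguments: requested time in hours, memory in GB, number of CPUs.
pick_partition() {
    hours="$1"
    mem_gb="$2"
    cpus="$3"
    if [ "$hours" -le 168 ] && [ "$mem_gb" -le 111 ] && [ "$cpus" -le 256 ]; then
        # Small enough to share a node
        echo "shared"
    elif [ "$hours" -lt 24 ]; then
        # Shorter than a day: main, with memory listed first for large requests
        if [ "$mem_gb" -gt 222 ]; then echo "memory,main"; else echo "main"; fi
    else
        # A day or longer: memory for large requests, otherwise long
        if [ "$mem_gb" -gt 222 ]; then echo "memory"; else echo "long"; fi
    fi
}
```

For instance, pick_partition 48 500 16 selects the memory partition, while pick_partition 4 8 16 selects shared. The sketch leaves out the 6 GB minimum memory that the config adds when a request is smaller than that.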