I want to build two datasets with my ruleset (a minimally modified version of the ncov workflow) in the same Snakemake profile and take advantage of parallelism, so I went and tried this:
- Defined the second build in the profile’s `builds.yaml`.
- Set my cluster job (AWS Batch, self-managed, i.e. not Nextstrain CLI’s AWS Batch mode) to 8 vCPUs.
- Set `cores: 8` and `set-threads: tree=4` in the profile’s `config.yaml` (sketched just below).
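
For reference, the relevant part of the profile’s `config.yaml` looks roughly like this (a sketch; other keys are omitted):

```yaml
# profile config.yaml (sketch; other keys omitted)
cores: 8               # total cores Snakemake may use at once
set-threads: tree=4    # give each tree job 4 threads
```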
I understand #3 should allow two instances of the `tree` rule to run in parallel, or one `tree` in parallel with one instance of `refine`, or two instances of `refine` in parallel. But instead I see:
- Multiple instances of `subsample` do run in parallel;
- Only one instance of `tree` or `refine` runs at a time.
I read up on Snakemake’s `mem_mb` resource and looked at the defaults set for it in the ncov workflow’s `main_workflow.smk`, wondering whether that was limiting my parallelism… but even if I go to a container with 32 GiB of memory and call Snakemake with `--resources mem_mb=30720`, there’s still no parallelism. Actually it’s worse: parallelism goes away for the `subsample` jobs too.
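
Concretely, the invocation looks roughly like this (the profile path is a placeholder standing in for my actual profile):

```bash
# Sketch of the call; my_profiles/two-builds stands in for my real profile
snakemake --profile my_profiles/two-builds --resources mem_mb=30720
```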
The `main_workflow.smk` file (at ncov commit `d7dc587`) defines the memory usage of the `tree` rule like this:
```python
mem_mb=lambda wildcards, input: 40 * int(input.size / 1024 / 1024)
```
…and the `filtered.fasta` files being used as input to those steps are 145.3 MB and 125.1 MB, so 40 times those sizes is 5,812 MB and 5,004 MB respectively; a 16 GiB EC2 instance should be more than enough to run them in parallel! Questions:
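
As a quick sanity check, here’s the same arithmetic done the way the lambda does it, assuming those file sizes are in MiB (the truncation by `int()` lands the results slightly below my figures above, but close enough):

```python
# Mirror of the ncov mem_mb lambda; sizes are stand-ins for my two
# filtered.fasta inputs (145.3 MB and 125.1 MB), treated as MiB here.
def tree_mem_mb(input_size_bytes: int) -> int:
    return 40 * int(input_size_bytes / 1024 / 1024)

for mib in (145.3, 125.1):
    print(tree_mem_mb(int(mib * 1024 * 1024)))  # -> 5800, then 5000
```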
- Am I missing some other lever that needs to be pulled here?
- How do I troubleshoot Snakemake’s parallel scheduling decisions? I looked through the documentation for the command-line options and nothing jumped out at me.
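
For what it’s worth, the closest candidates I found were these (I’m not sure any of them actually explains the scheduler’s decisions; the profile path is again a placeholder):

```bash
# Dry run: list the jobs Snakemake plans and their shell commands
snakemake --profile my_profiles/two-builds -n -p

# Debug-level output, which includes some scheduler information
snakemake --profile my_profiles/two-builds --verbose

# Swap the default ILP scheduler for the simpler greedy one
snakemake --profile my_profiles/two-builds --scheduler greedy
```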