I want to build two datasets with my ruleset (a minimally modified version of the ncov workflow) in the same Snakemake profile and take advantage of parallelism, so I went and tried this:
- Defined the second build in the profile’s builds.yaml.
- Set my cluster job (AWS Batch, self-managed (= not Nextstrain CLI’s AWS Batch mode)) to 8 vCPUs.
- Set cores: 8 and set-threads: tree=4 in the profile’s config.yaml.
I understand #3 should allow two instances of the tree rule to run in parallel, or one in parallel with one instance of refine, or two instances of refine in parallel. But instead I see:
- Multiple instances of subsample do run in parallel;
- Only one instance of tree or refine runs at a time.
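To spell out the expectation behind #3, here is a minimal sketch of the core budget as I understand it. It is a simplified model, not Snakemake's actual scheduler, and the function name max_concurrent is made up for illustration.

```python
def max_concurrent(cores: int, threads_per_job: int) -> int:
    """Number of identical jobs that fit in the core budget (simplified model).

    Assumption: a job's threads are capped at the total number of cores,
    and jobs run together as long as their thread counts sum to <= cores.
    """
    effective_threads = min(threads_per_job, cores)
    return cores // effective_threads


# Values from my profile: cores: 8 and set-threads: tree=4
tree_jobs_at_once = max_concurrent(cores=8, threads_per_job=4)
print(tree_jobs_at_once)
```

With 8 cores and 4 threads per tree job, this model allows two tree jobs at once, which is what I expected to see.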
I read up on Snakemake’s mem_mb resource and looked at the defaults set for that in the ncov workflow’s main_workflow.smk, wondered whether that was limiting my parallelism… but even if I go to a container with 32 GiB of memory and call Snakemake with --resources mem_mb=30720, still no parallelism. Actually worse, I saw parallelism go away for the subsample jobs.
The main_workflow.smk file (at ncov commit d7dc587) defines the memory usage of the tree rule like this:
mem_mb=lambda wildcards, input: 40 * int(input.size / 1024 / 1024)
…and the filtered.fasta files that are being used as input to those steps are 145.3 MB and 125.1 MB, so 40 times that respectively is 5,812 MB and 5,004 MB; a 16 GiB EC2 instance should be more than enough to run them in parallel!
Questions:
- Am I missing some other lever that needs to be pulled here?
- How do I troubleshoot Snakemake’s parallel scheduling decisions? I looked through the documentation for the command-line options and nothing jumped out to my eye.
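For reference, this is the memory arithmetic from above as a small sketch. The function tree_mem_mb mirrors the lambda quoted from main_workflow.smk; the name itself, and the assumption that the 145.3 MB and 125.1 MB figures are in units of 1024 * 1024 bytes, are mine.

```python
MIB = 1024 * 1024


def tree_mem_mb(input_size_bytes: int) -> int:
    # Same expression as the quoted lambda: 40 * int(input.size / 1024 / 1024)
    return 40 * int(input_size_bytes / 1024 / 1024)


# Assumption: the reported file sizes are in MiB (1024 * 1024 bytes).
input_sizes = [int(145.3 * MIB), int(125.1 * MIB)]

requests = [tree_mem_mb(size) for size in input_sizes]
total = sum(requests)
budget = 30720  # the value I passed via --resources mem_mb=30720

print(requests, total, total <= budget)
```

Because of the int() truncation, the requests come out slightly lower than my rounded figures (5,800 and 5,000 under this assumption), and their sum of 10,800 is well under the 30,720 budget, so mem_mb alone should not be preventing two tree jobs from running together.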