I need help trying to troubleshoot the following error:
localrules directive specifies rules that are not present in the Snakefile:
upload
download
Building DAG of jobs…
WorkflowError:
Target rules may not contain wildcards. Please specify concrete files or a rule without wildcards.
localrules directive specifies rules that are not present in the Snakefile:
upload
download
Building DAG of jobs…
MissingInputException in line 97 of /blue/bphl-florida/schmedess/data/SARS-CoV-2/A/nextstrain_builds/20210330_A_global/ncov/Snakefile:
Missing input files for rule all:
auspice/ncov_A.json
auspice/ncov_A_tip-frequencies.json
I normally run my Nextstrain builds using a profile with subsampling. For this build, however, I'm trying to include all samples in the input file (no subsampling), and I keep getting the error above. Can you help me troubleshoot?
Hi @seschmedes, just to make sure I understand how you’ve set up your build, do you currently have a builds.yaml file that does not have a subsampling top-level key? The current workflow doesn’t allow you to skip the subsampling step, although we’ve talked about this as a feature to implement.
The best workaround to get the effect of including all sequences while subsampling is to define an empty subsampling rule. For example, in the example getting started profile, you could define the following simple build and subsampling scheme:
builds:
  global:
    subsampling_scheme: getting-started
    region: global

subsampling:
  getting-started:
    # Define one subsampling rule in the `getting-started` scheme that selects all
    # input sequences.
    all:
      # Define an empty placeholder in the `all` dictionary, so Snakemake will know
      # this is a dictionary.
      empty:
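If it helps to sanity-check a config like this before running Snakemake, here is a minimal sketch of a check that every build points at a scheme defined under `subsampling`. The helper name `missing_schemes` is hypothetical and not part of the ncov workflow. The dictionary mirrors the YAML above as it would look after parsing (`empty:` becomes `None`). The check only looks at this one config, so it knows nothing about schemes defined elsewhere, such as workflow defaults.

```python
# Hypothetical helper for illustration; not part of the ncov workflow.
def missing_schemes(config):
    """Return the names of builds whose subsampling_scheme is not defined
    under the top-level `subsampling` key of this config."""
    schemes = config.get("subsampling") or {}
    builds = config.get("builds") or {}
    return sorted(
        name
        for name, build in builds.items()
        # A build with no subsampling_scheme at all is also reported.
        if (build or {}).get("subsampling_scheme") not in schemes
    )


# The config above after YAML parsing; the bare `empty:` key parses to None.
config = {
    "builds": {
        "global": {"subsampling_scheme": "getting-started", "region": "global"},
    },
    "subsampling": {
        "getting-started": {"all": {"empty": None}},
    },
}

print(missing_schemes(config))
```

An empty list means every build in this file refers to a scheme that the same file defines.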
We use a similar approach in the example multiple inputs profile where two different metadata sets get merged with a column named aus added during the merging. This example build uses just the exclude key to filter out strains that are not from the Australian dataset, effectively keeping all data from that Australian dataset.
Hi @jlhudd, I have a bit of a follow-up to this (or maybe a separate issue with a similar error message).
I think I am encountering a somewhat related error, though I’ve made some major changes to the pipeline by using a totally custom subsampling scheme that has replaced the entirety of the standard workflow up to the augur tree step, so hopefully this is still a relevant question.
I have some builds that are also getting the Missing input files for rule all error. These builds are all defined in the builds.yaml file (sorry for using screenshots instead of direct blockquotes; there is a lot of less important cruft from other builds that I’m omitting, and I was having trouble formatting my indentations):
When I try to run things I get the input file error:
$ time snakemake --cores 35 --profile my_profiles/sars-cov-2-belgium/ -p
Building DAG of jobs…
MissingInputException in line 75 of ~/projects/sars-cov-2-belgium/Snakefile:
Missing input files for rule all:
auspice/sars-cov-2-belgium_P.1.json
auspice/sars-cov-2-belgium_P.1_tip-frequencies.json
auspice/sars-cov-2-belgium_B.1.214_tip-frequencies.json
auspice/sars-cov-2-belgium_B.1.214.json
real 0m0.397s
user 0m0.344s
sys 0m0.046s
I should also note that the builds shown in the builds.yaml file that don’t appear as errors here are the ones that already have their json files in the auspice directory, as they were completed under a previous version of the pipeline.
My only guesses at this time are that either I broke a connection between Snakefile and main_workflow.smk somehow, or that something with how I’ve formatted my builds.yaml file is wrong.
In the short term, you can update the regex yourself in your own copy of the workflow or you can rename your builds to replace periods with underscores or some other delimiter.
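To illustrate the renaming option, here is a small sketch. The pattern below is an assumption made for illustration (letters, digits, underscores and hyphens only), not the workflow's actual regex, so check the wildcard constraints in your own copy of the Snakefile. The helper names are hypothetical too.

```python
import re

# ASSUMED pattern for illustration only: letters, digits, underscores, hyphens.
# The real constraint lives in your copy of the workflow and may differ.
BUILD_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_build_name(name):
    """True if the whole name matches the assumed build-name pattern."""
    return BUILD_NAME_PATTERN.fullmatch(name) is not None


def sanitize_build_name(name):
    """Replace periods with underscores, as suggested above."""
    return name.replace(".", "_")


for name in ["P.1", "B.1.214"]:
    fixed = sanitize_build_name(name)
    print(name, is_valid_build_name(name), "->", fixed, is_valid_build_name(fixed))
```

Under this assumed pattern, `P.1` and `B.1.214` are rejected while `P_1` and `B_1_214` pass, which would line up with the two builds that fail above.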