New version rule align error

Hello! I used a previous version of Nextstrain a year or two ago, and now I am unable to run the same files. I get the same error when I follow the ncov tutorial. I’ve tried changing the version to V10 and V11, and I still hit a rule align issue, just with a slightly different error message. I am running: nextstrain build . --configfile my_profiles/builds_rep1.yaml. I’ve changed this line and the reference links a few times, and have been troubleshooting for half the day so far.

Thank you so much for your help!
Gabriella

My yaml is below:

    inputs:

    builds:
      georgia:
        subsampling_scheme: georgia_scheme
        country: USA
        division: Georgia
        auspice_config: ncov-tutorial/auspice-config-custom-data.json

    subsampling:
      georgia_scheme:
        focal:
          query: --query "location != 'OOS'"
        contextual:
          group_by: "region year week"
          max_sequences: 5000 #needs group_by to work
          query: --query "location == 'OOS'"
          priorities:
            type: "proximity"
            focus: "focal"

    traits:
      georgia: ###build name
        sampling_bias_correction: 2.5
        columns: ["division"] ###traits to reconstruct, must match column names in metadata

My error is here:
Error in rule align:
jobid: 12
input: data/contextual_sequences.fasta, defaults/annotation.gff, defaults/reference_seq.fasta
output: results/aligned_worldwide.fasta.xz
log: logs/align_worldwide.txt (check log file(s) for error details)
conda-env: /home/gev25289/ncov/.snakemake/conda/22ec1a01f6d53102475e14a528cf67cd_
shell:

    python3 scripts/sanitize_sequences.py \
        --sequences data/contextual_sequences.fasta \
        --strip-prefixes hCoV-19/ SARS-CoV-2/ \
        --output /dev/stdout 2> logs/sanitize_sequences_worldwide.txt \
        | nextclade run \
        --jobs=8 \
        --input-ref defaults/reference_seq.fasta \
        --input-annotation defaults/annotation.gff \
        --output-fasta results/aligned_worldwide.fasta > logs/align_worldwide.txt 2>&1;
    xz -2 -T 8 results/aligned_worldwide.fasta;
    
    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

My error in the logs file is here:
error: Found argument '--input-annotation' which wasn't expected, or isn't valid in this context

Did you mean '--input-dataset'?

If you tried to supply `--input-annotation` as a value rather than a flag, use `-- --input-annotation`

USAGE:
nextclade run --jobs --input-ref <INPUT_REF> --input-dataset <INPUT_DATASET>

For more information try --help

Hi Gabriella @GVeytsel

The error seems to come from Nextclade, which some of the pipelines in the Nextstrain organization use internally. Nextclade does not recognize the --input-annotation argument, which suggests that the installed version of Nextclade is too old: the argument was introduced about a year ago in version 3.
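One way to check this is to compare the major version reported by nextclade --version against 3. The sketch below is only an illustration: the helper names major_version and supports_input_annotation are made up for this example, the "3 or newer" threshold is taken from the explanation above, and the exact format of the version string is an assumption (the parser just takes the first run of digits). It should be run in the same environment the workflow uses, otherwise it may report a different Nextclade than the one the align rule calls.

```shell
#!/bin/sh
# Print the first run of digits found on the first line of a version string,
# e.g. "nextclade 3.8.2" -> 3 and "2.14.0" -> 2. (Hypothetical helper.)
major_version() {
  printf '%s\n' "$1" | head -n 1 | sed -n 's/^[^0-9]*\([0-9][0-9]*\).*/\1/p'
}

# Succeed only if the version string looks like major version 3 or newer.
# Assumption from the reply above: --input-annotation needs Nextclade 3+.
supports_input_annotation() {
  major=$(major_version "$1")
  [ -n "$major" ] && [ "$major" -ge 3 ]
}

if command -v nextclade >/dev/null 2>&1; then
  installed=$(nextclade --version 2>&1 || true)
  if supports_input_annotation "$installed"; then
    echo "OK for --input-annotation: $installed"
  else
    echo "Too old for --input-annotation: $installed"
  fi
else
  echo "nextclade is not on PATH in this shell"
fi
```

If this prints "Too old", the pipeline and the Nextclade it calls are out of step, which matches the error in the log.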

I hypothesize that you updated the pipeline (e.g. nextstrain/ncov) but did not also upgrade nextstrain/cli and/or its runtime (e.g. conda, docker, or ambient), so the old dependencies are still being used. Nextclade might just be the first tool to show it; there could be more. Please refer to the latest documentation of the project(s) you are using and the documentation of Nextstrain CLI to learn how to update (or reinstall).
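As a rough starting point, the Nextstrain CLI can report what is installed and update its runtime. This is a sketch, not a full upgrade procedure: the function name refresh_nextstrain is made up for this example, the exact behaviour of each subcommand depends on your CLI version, and it does not update the pipeline checkout itself or a manually managed ("ambient") environment. The official documentation remains the reference.

```shell
#!/bin/sh
# Show what is installed, then update the runtime used by `nextstrain build`.
# (refresh_nextstrain is a hypothetical helper name for this example.)
refresh_nextstrain() {
  if ! command -v nextstrain >/dev/null 2>&1; then
    echo "nextstrain CLI not found on PATH; (re)install it first"
    return 0
  fi
  nextstrain version --verbose   # CLI version plus versions of runtime components
  nextstrain check-setup         # which runtimes are set up and which is the default
  nextstrain update              # update the default runtime
}

refresh_nextstrain
```

Comparing the output of nextstrain version --verbose before and after the update shows whether the runtime's dependencies actually changed.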

To give an example, the dependencies of the nextstrain/ncov pipeline are listed in workflow/envs/nextstrain.yaml in the nextstrain/ncov repository on GitHub (at commit 20f5fc3c7032f4575a99745cee3238ecbeebb6e0).

It could also be that you have multiple versions of Nextclade installed and the wrong one is picked up, especially if you are using the “ambient” runtime for nextstrain/cli (i.e. manually managed dependencies). In any case, Nextclade v2 is deprecated and no longer recommended, so if you also use Nextclade as a standalone tool, it’s worth updating that too. You can find out the version of your currently installed Nextclade by running nextclade --version.
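To see whether more than one Nextclade is installed, you can walk through PATH and print every nextclade executable with the version it reports. The first one listed is the one the shell picks up. This is a minimal sketch: list_nextclade is a made-up helper name, and it only sees the PATH of the shell you run it in, which may differ from the environment the workflow uses.

```shell
#!/bin/sh
# List every `nextclade` executable on PATH, in lookup order, with its version.
# The body runs in a subshell, so changing IFS does not leak into the caller.
list_nextclade() (
  IFS=:
  found=0
  for dir in $PATH; do
    exe="${dir:-.}/nextclade"          # an empty PATH entry means the current directory
    if [ -f "$exe" ] && [ -x "$exe" ]; then
      found=1
      version=$("$exe" --version 2>/dev/null) || version="version unknown"
      printf '%s\t%s\n' "$exe" "$version"
    fi
  done
  [ "$found" -eq 1 ] || echo "no nextclade found on PATH"
)

list_nextclade
```

If an old copy appears above a newer one, removing it or reordering PATH should make the newer one win.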

Thank you so much for the quick reply! It was helpful to understand that it was a version issue. After removing and updating everything, things are back to running.