Hi Nextstrain Team,
I am trying to run the ncov pipeline with my own reference and custom data, which I downloaded from GISAID in the Augur format. I have run this workflow before, when there were separate YAML files for the different types of builds. This time, I am using the default parameters.yaml and have adjusted it accordingly.
However, I cannot seem to get past this error:
[Mon Aug 4 16:48:34 2025]
Job 11:
Aligning sequences to defaults/reference_seq.fasta
- gaps relative to reference are considered real
Reason: Missing output files: results/aligned_custom_data.fasta.xz
Shell command:
python3 scripts/sanitize_sequences.py \
--sequences data/ba3_sequences.fasta \
--strip-prefixes hCoV-19/ SARS-CoV-2/ \
--output /dev/stdout 2> logs/sanitize_sequences_custom_data.txt \
| nextclade run \
--jobs=8 \
--input-ref defaults/reference_seq.fasta \
--input-annotation defaults/annotation.gff \
--output-fasta results/aligned_custom_data.fasta > logs/align_custom_data.txt 2>&1;
xz -2 -T 8 results/aligned_custom_data.fasta;
/Users/nikitasitharam/.nextstrain/runtimes/conda/env/bin/bash: line 11: xz: command not found
RuleException:
CalledProcessError in file "/Users/nikitasitharam/ncov/workflow/snakemake_rules/main_workflow.smk", line 90:
Command 'set -euo pipefail;
python3 scripts/sanitize_sequences.py \
--sequences data/ba3_sequences.fasta \
--strip-prefixes hCoV-19/ SARS-CoV-2/ \
--output /dev/stdout 2> logs/sanitize_sequences_custom_data.txt \
| nextclade run \
--jobs=8 \
--input-ref defaults/reference_seq.fasta \
--input-annotation defaults/annotation.gff \
--output-fasta results/aligned_custom_data.fasta > logs/align_custom_data.txt 2>&1;
xz -2 -T 8 results/aligned_custom_data.fasta;' returned non-zero exit status 127.
In this conda env I am using augur 31.3.0 and nextstrain.cli 8.5.3, and I recently updated the runtime.
xz is also installed in this environment:
xz (XZ Utils) 5.8.
liblzma 5.8.1
I am not sure why the shell command in main_workflow.smk cannot find it.
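One check that might narrow this down: the error above shows bash running from .nextstrain/runtimes/conda/env/bin, so it may be worth testing whether xz exists in that directory and not only in my active shell. This is a minimal sketch, not something taken from the ncov documentation. The has_tool helper is a hypothetical name, and the runtime path is copied from the error message and assumed to be the environment the workflow actually uses.

```shell
# Hypothetical helper: report whether an executable exists in a given bin directory.
has_tool() {
  env_bin="$1"
  tool="$2"
  if [ -x "$env_bin/$tool" ]; then
    echo "found: $env_bin/$tool"
  else
    echo "missing: $tool is not in $env_bin"
    return 1
  fi
}

# Assumption: this is the runtime directory, as shown in the error message above.
RUNTIME_BIN="$HOME/.nextstrain/runtimes/conda/env/bin"

# Does the runtime that Snakemake's bash came from have its own xz?
has_tool "$RUNTIME_BIN" xz || true

# For comparison: which xz (if any) does the current shell resolve?
command -v xz || echo "xz is not on PATH in this shell"
```

If the first check reports xz as missing while the second one finds it, that would suggest the xz I see is in a different environment from the one the workflow runs in.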
Kind regards,
Nikita