Error: Alignment must have at least 3 sequences

Hello,

It’s been a while since I last ran my local build of nextstrain, so I went ahead and updated everything, first from GitHub, then from the command line using “conda update --all”.

Ever since the update, I have not been able to run the analysis successfully. It looks like a lot has changed, and I was wondering if someone could help me pin down where this issue is coming from.

I am not sure why it is asking for 3 sequences in the alignment. Is it saying that none of my sequences passed the quality checks, so that only the two reference sequences (Wuhan/Hu-1/2019 and Wuhan/WH01/2019) remain?

If it helps, those two reference sequences are the only sequences in my aligned-delim.fasta file in the results folder. Almost all of the sequences I have passed to nextstrain have passed the quality checks in the past. I’m not sure where to look next.

Thanks for your help,
Jonathan

Hi! I’ve been having the same issue too. Have you been able to pin down the problem? Thanks!

@augustii Have a look at the logs of the rule subsample, defined in main_workflow.smk in nextstrain/ncov on GitHub (at commit c9c48391bc911f4d4b262e13618ea73ca788e223).

This is one of the places where sequences may disappear.

The path to the logs looks like this: logs/subsample_{build_name}_{subsample}.txt

In the Snakemake workflow that I linked to above you can see the output files from that rule. It’s worth checking them and the rules that consume these output files.
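If it helps, here is a minimal Python sketch for pulling the summary lines out of those logs. It assumes the logs contain augur filter output worded like the lines quoted in this thread ("N strains were dropped during filtering", "N strains passed all filters"); other Augur versions may phrase them differently. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Patterns based on the augur filter output quoted in this thread;
# the exact wording in other Augur versions is an assumption.
DROPPED = re.compile(r"^(\d+) strains? (?:were|was) dropped during filtering")
PASSED = re.compile(r"^(\d+) strains? passed all filters")


def summarize_filter_log(text):
    """Return one (dropped, passed) pair per filter summary found in the log text."""
    summaries = []
    dropped = None
    for raw in text.splitlines():
        line = raw.strip()
        match = DROPPED.match(line)
        if match:
            dropped = int(match.group(1))
            continue
        match = PASSED.match(line)
        if match:
            summaries.append((dropped, int(match.group(1))))
            dropped = None
    return summaries


def summarize_logs(log_dir="logs", pattern="subsample_*.txt"):
    """Print the summaries for every matching log file (hypothetical helper)."""
    for log in sorted(Path(log_dir).glob(pattern)):
        for dropped, passed in summarize_filter_log(log.read_text()):
            print(f"{log.name}: dropped={dropped}, passed={passed}")
```

Calling summarize_logs() from the ncov directory prints one line per filter run, which makes it easier to spot the step where the number of passing strains collapses.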

It’s hard to debug something like this from afar, but if you share the logs and the input and output files, we can try to get there and maybe help others who have a similar issue.

I’d also be curious whether @jbarnell has figured it out in the meantime - although it’s been quite a while, sorry for that.

Same issue here. I cloned nextstrain and ncov yesterday and am running natively.

Not sure why nearly all of the example data is being dropped.

Job 10: 
        Combine and deduplicate FASTAs
        
Reason: Input files updated by another job: results/default-build/sample-all.txt


        augur filter             --sequences results/aligned_reference_data.fasta.xz             --metadata results/sanitized_metadata_reference_data.tsv.xz             --exclude-all             --include results/default-build/sample-all.txt             --output-sequences results/default-build/default-build_subsampled_sequences.fasta.xz             --output-metadata results/default-build/default-build_subsampled_metadata.tsv.xz 2>&1 | tee logs/subsample_regions_default-build.txt
        
488 strains were dropped during filtering
	240 had no metadata
	250 of these were dropped by `--exclude-all`
	250 strains were added back because they were in results/default-build/sample-all.txt
2 strains passed all filters

Note: You did not provide a sequence index, so Augur will generate one. You can generate your own index ahead of time with `augur index` and pass it with `augur filter --sequence-index`.
0 strains were dropped during filtering
2 strains passed all filters
[Wed Aug 31 10:21:38 2022]
Finished job 7.
10 of 30 steps (33%) done
Select jobs to execute...

[Wed Aug 31 10:21:38 2022]
Job 6: Building tree
Reason: Missing output files: results/default-build/tree_raw.nwk; Input files updated by another job: results/default-build/filtered.fasta
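The filter output above reports that 240 strains had no metadata, so one thing that may be worth checking is whether the names in the sequence file match the names in the metadata file. The following is a rough Python sketch of that comparison, not part of the ncov workflow. It assumes the metadata is tab-separated with the strain name in a column called "strain" (adjust the column argument if yours differs), and the helper names are invented for this example.

```python
import csv
import lzma


def open_text(path):
    """Open a plain or .xz-compressed text file for reading."""
    path = str(path)
    if path.endswith(".xz"):
        return lzma.open(path, "rt", newline="")
    return open(path, "rt", newline="")


def fasta_names(lines):
    """Collect sequence names (first word after '>') from FASTA lines."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def metadata_names(lines, column="strain"):
    """Collect names from a tab-separated table; the 'strain' column is an assumption."""
    return {row[column] for row in csv.DictReader(lines, delimiter="\t")}


def names_without_metadata(fasta_path, metadata_path, column="strain"):
    """Return sorted sequence names that have no matching metadata row."""
    with open_text(fasta_path) as fasta, open_text(metadata_path) as metadata:
        return sorted(fasta_names(fasta) - metadata_names(metadata, column))
```

For the run above, the two inputs would be results/aligned_reference_data.fasta.xz and results/sanitized_metadata_reference_data.tsv.xz, the paths shown in the augur filter command.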

Same error

Error in rule tree:
    jobid: 6
    output: results/default-build/tree_raw.nwk
    log: logs/tree_default-build.txt (check log file(s) for error message)
    conda-env: path-to/ncov/.snakemake/conda/606fba2748c6c88ce497ee03a13af39a_
    shell:
        
        augur tree             --alignment results/default-build/filtered.fasta             --tree-builder-args '-ninit 10 -n 4'             --exclude-sites defaults/sites_ignored_for_tree_topology.txt             --output results/default-build/tree_raw.nwk             --nthreads 8 2>&1 | tee logs/tree_default-build.txt
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

The result is:

ncov/results/default-build/
aligned.fasta:

Wuhan-Hu-1/2019
Wuhan/WH01/2019
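With only two records left, the tree step runs into the 3-sequence minimum from the error message. A quick way to see how many sequences each intermediate file holds is to count FASTA header lines. This is just a small Python sketch; the function names are hypothetical, and the default minimum of 3 is taken from the error text in this thread.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def enough_for_tree(lines, minimum=3):
    """Return True if there are at least `minimum` sequences (3, per the error message)."""
    return count_fasta_records(lines) >= minimum
```

For an uncompressed file such as results/default-build/filtered.fasta, opening it with open() and passing the file handle to count_fasta_records is enough.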

Testing different includes:

defaults/include.txt
original:
Wuhan/Hu-1/2019
Wuhan/Hu-1/2019 #why 2X?
Wuhan/WH01/2019

Changed to:
Wuhan/Hu-1/2019
Wuhan/WH01/2019
Wuhan/WH04/2019

and to:
Wuhan/Hu-1/2019

Same result
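For anyone repeating this kind of test, a small Python sketch like the one below can list the names in an include file and flag entries that appear more than once. It treats everything after a "#" as a comment and skips blank lines, which is an assumption about the file format rather than something confirmed in this thread; the function names are made up.

```python
from collections import Counter


def read_strain_list(lines):
    """Return strain names, ignoring blank lines and anything after '#' (assumed comment marker)."""
    names = []
    for line in lines:
        name = line.split("#", 1)[0].strip()
        if name:
            names.append(name)
    return names


def duplicated(names):
    """Return the sorted names that occur more than once."""
    return sorted(name for name, count in Counter(names).items() if count > 1)
```

Opening defaults/include.txt and passing the file handle to read_strain_list gives the list to check.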