IQTREE error: Some sequences (see above) are problematic, please check your alignment again

jonr · April 10, 2024, 1:40pm

Hi,
My build stops at the iqtree stage with the message: “ERROR: Some sequences (see above) are problematic, please check your alignment again”

Job 3: Building tree
Reason: Missing output files: nextstrain_results/tree_raw.nwk; Input files updated by another job: nextstrain_results/aligned.fasta


        augur tree             --alignment nextstrain_results/aligned.fasta             --output nextstrain_results/tree_raw.nwk             --method iqtree             --override-default-args             --substitution-model auto             --nthreads 10             --tree-builder-args "-B 1000"
        
Building a tree via:
	iqtree -ntmax 10 -s nextstrain_results/aligned-delim.fasta -B 1000 > nextstrain_results/aligned-delim.iqtree.log
	Nguyen et al: IQ-TREE: A fast and effective stochastic algorithm for estimating maximum likelihood phylogenies.
	Mol. Biol. Evol., 32:268-274. https://doi.org/10.1093/molbev/msu300

Conducting a model test... see 'nextstrain_results/aligned-delim.iqtree.log' for the result. You can specify this with --substitution-model in future runs.

ERROR: Shell exited 2 when running: iqtree -ntmax 10 -s nextstrain_results/aligned-delim.fasta -B 1000 > nextstrain_results/aligned-delim.iqtree.log
Command output was:
  ERROR: Some sequences (see above) are problematic, please check your alignment again

ERROR: TREE BUILDING FAILED
ERROR: Command '['/bin/bash', '-c', 'set -euo pipefail; iqtree -ntmax 10 -s nextstrain_results/aligned-delim.fasta -B 1000 > nextstrain_results/aligned-delim.iqtree.log']' returned non-zero exit status 2.
Please see the log file for more details: nextstrain_results/aligned-delim.iqtree.log

Building original tree took 0.20879316329956055 seconds

There are many warnings in the alignment step. For example:
WARNING: this insertion was caused due to 'N's or '?'s in provided sequences

But the sequences passed all filtering and processing steps earlier. Do I need to manually inspect the sequence, or how can I avoid this error?

Thanks

jlhudd · April 10, 2024, 4:41pm

Hi @jonr, one quick way to check your alignment for sequences with problematic characters is to run augur index --sequences nextstrain_results/aligned.fasta --output nextstrain_results/alignment_index.tsv. The augur index command produces a table of per-sequence counts for standard nucleotide characters, other valid IUPAC characters, ambiguous or gap characters (“N”, “?”, “-”), and other invalid characters. You can filter this table by those counts to find potentially problematic sequences. IQ-TREE will not accept sequences with invalid IUPAC characters, but it should handle the other ambiguous characters.

You can tell augur filter to exclude sequences with invalid characters with the --non-nucleotide flag. Using this flag requires you to provide your sequences as an input along with the metadata.

If you don’t see any issues with the number of invalid characters in your alignment, it would be helpful to visualize your alignment with a tool like AliView.
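As a rough sketch of filtering that index table, the Python below flags rows that contain invalid characters or no A, C, G, or T at all. The column names (strain, A, C, G, T, invalid_nucleotides) are assumptions about the augur index output, and find_problem_sequences is a hypothetical helper, so check the header of your own alignment_index.tsv before relying on it.

```python
import csv

# Assumed column names for the four standard nucleotides in the index table.
NUCLEOTIDES = ("A", "C", "G", "T")


def find_problem_sequences(index_path):
    """Return (strain, reason) pairs for index rows that look problematic."""
    problems = []
    with open(index_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            # Total number of unambiguous bases in this sequence.
            acgt = sum(int(row[base]) for base in NUCLEOTIDES)
            if int(row["invalid_nucleotides"]) > 0:
                problems.append((row["strain"], "invalid characters"))
            elif acgt == 0:
                problems.append((row["strain"], "no A/C/G/T characters"))
    return problems
```

Calling find_problem_sequences("nextstrain_results/alignment_index.tsv") would then list the strains worth inspecting first, for example in AliView.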

jonr · April 11, 2024, 8:17am

Thanks @jlhudd!
I included a bunch of sequences with either only N’s or mostly N’s. I thought these would be filtered out during augur filter and align, but they were still part of the aligned.fasta. Removing them fixed the problem.

Sometimes we get these sequences with only N’s because we create reference-based consensus sequences. But I can include some additional sanity checks before we start the Nextstrain build.

jlhudd · April 11, 2024, 10:08pm

@jonr I’m glad you found the issue! When you run augur filter, you can provide a minimum length per sequence with the --min-length argument, which filters based on the number of A, C, G, and T characters in each sequence. For example, you could run the filter command with --min-length 1 to ensure that sequences consisting entirely of Ns get dropped from your analysis.
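For a sanity check before the build starts, the same rule can be sketched in plain Python: count the A, C, G, and T characters in each FASTA record and report the records that fall below a threshold. This only mirrors the behaviour described above and is not augur's implementation; the function names and the min_length parameter are hypothetical.

```python
def count_acgt_per_sequence(fasta_path):
    """Map each FASTA record name to its number of A, C, G, and T characters."""
    counts = {}
    name = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name.
                header = line[1:].split()
                name = header[0] if header else ""
                counts[name] = 0
            elif name is not None:
                # Case-insensitive count of unambiguous bases on this line.
                counts[name] += sum(line.upper().count(base) for base in "ACGT")
    return counts


def sequences_below_min_length(fasta_path, min_length=1):
    """Return names of records with fewer than min_length A/C/G/T characters."""
    counts = count_acgt_per_sequence(fasta_path)
    return [name for name, count in counts.items() if count < min_length]
```

With min_length=1 this reports only the all-N consensus sequences; a larger threshold would also catch the mostly-N ones.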
