Clade Labeling forces AA Mut-file although stated otherwise


We are trying to run a Nextstrain bacterial pathogen workflow and are struggling to annotate our clades.
Does “augur clades” really require an amino acid (aa) mutations file, or is it possible to label clades with nucleotide (nuc) mutations only? Amino acid mutations play no role in our question. We just want to label our clades based on nuc SNPs.

We follow the file format described in the docs, with gene=nuc.
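For readers unfamiliar with the format, here is a minimal sketch of a nucleotide-only clade definition file. It assumes the four-column layout (clade, gene, site, alt) described in the augur clades documentation; the clade names, positions and alleles below are made up purely for illustration.

```python
import csv
import io

# Hypothetical clade definitions: names, positions and alleles are invented.
# "nuc" in the gene column marks a nucleotide (not amino-acid) signature.
CLADE_ROWS = [
    ("lineage_A", "nuc", 1234, "T"),
    ("lineage_B", "nuc", 5678, "G"),
    ("lineage_B", "nuc", 9012, "A"),
]


def clades_tsv(rows):
    """Render clade definitions as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["clade", "gene", "site", "alt"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Redirect this output to clades.tsv to get a file to experiment with.
    print(clades_tsv(CLADE_ROWS), end="")
```

A clade listed on several rows, like lineage_B here, is defined by all of its listed mutations together.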

augur clades --tree augur_out_tree_refined.nwk \
  --reference $REFERENCE_FNA \
  --clades clades.tsv \
  --mutations nt_muts.json \
  --output-node-data 06_augur_out_clades.json

Traceback (most recent call last):
  File "/home/XXXX/apps/miniconda/envs/nextstrain/bin/augur", line 10, in
  File "/home/XXXX/apps/miniconda/envs/nextstrain/lib/python3.8/site-packages/augur/", line 10, in main
    return argv[1:] )
  File "/home/XXXX/apps/miniconda/envs/nextstrain/lib/python3.8/site-packages/augur/", line 75, in run
  File "/home/XXXX/apps/miniconda/envs/nextstrain/lib/python3.8/site-packages/augur/", line 208, in run
    clade_membership = assign_clades(clade_designations, all_muts, tree, ref)
  File "/home/XXXX/apps/miniconda/envs/nextstrain/lib/python3.8/site-packages/augur/", line 120, in assign_clades
    tree.root.sequences.update({gene:{} for gene in all_muts[]['aa_muts']})
KeyError: 'aa_muts'



Thanks for getting in touch. According to the documentation, current Augur (v22.2.0) shouldn’t require amino acid mutations. Nucleotide mutations alone should be enough, judging by the “and/or” in the --mutations help text.
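To see what a given --mutations file actually provides, a quick check along these lines can help. This is only a sketch: it assumes the usual node-data layout with a top-level "nodes" object, and that nucleotide mutations are stored under "muts" while amino-acid mutations are stored under "aa_muts" (the key named in the KeyError). The inline example data is invented.

```python
import json


def mutation_keys(node_data):
    """Return which mutation keys ('muts', 'aa_muts') appear on any node."""
    found = set()
    for attrs in node_data.get("nodes", {}).values():
        found.update(key for key in ("muts", "aa_muts") if key in attrs)
    return found


# Invented stand-in for the contents of a nucleotide-only node-data JSON.
example = json.loads(
    '{"nodes": {"NODE_0000000": {"muts": []}, "sample_1": {"muts": ["A123T"]}}}'
)

if __name__ == "__main__":
    keys = mutation_keys(example)
    print("nucleotide mutations present:", "muts" in keys)
    print("amino-acid mutations present:", "aa_muts" in keys)
```

For a real file, replace the inline example with json.load() on the opened JSON. A file with only nucleotide mutations is exactly the situation in which an Augur release that reads 'aa_muts' unconditionally would fail.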

The line numbers in your traceback (very helpful) suggest you’re running a rather old version of augur. Can you check what augur --version reports? How did you install augur?
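If you want to compare the reported version against the documented one in a script, a small helper like the following would do. It is a sketch under assumptions: it simply picks the first dotted X.Y.Z triple out of whatever text augur --version prints, and the function names are made up for this example.

```python
import re


def parse_version(text):
    """Extract the first X.Y.Z triple from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no X.Y.Z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_older_than(reported, reference="22.2.0"):
    """True if the reported version predates the reference version."""
    return parse_version(reported) < parse_version(reference)


if __name__ == "__main__":
    # Paste the text printed by `augur --version` here; this value is a placeholder.
    reported = "augur 9.0.0"
    print(is_older_than(reported))
```

Comparing integer tuples avoids the string-comparison trap where "9.0.0" would sort after "22.2.0".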

Here’s current augur clades -h output:

$ augur clades -h
usage: augur clades [-h] --tree TREE --mutations NODE_DATA_JSON [NODE_DATA_JSON ...] --clades TSV [--output-node-data NODE_DATA_JSON]
                    [--membership-name MEMBERSHIP_NAME] [--label-name LABEL_NAME]

Assign clades to nodes in a tree based on amino-acid or nucleotide signatures. Nodes which are members of a clade are stored via
<OUTPUT_NODE_DATA> → nodes → <node_name> → clade_membership and if this file is used in `augur export v2` these will automatically become a
coloring. The basal nodes of each clade are also given a branch label which is stored via <OUTPUT_NODE_DATA> → branches → <node_name> →
labels → clade. The keys "clade_membership" and "clade" are customisable via command line arguments.

  -h, --help            show this help message and exit
  --tree TREE           prebuilt Newick -- no tree will be built if provided (default: None)
  --mutations NODE_DATA_JSON [NODE_DATA_JSON ...]
                        JSON(s) containing ancestral and tip nucleotide and/or amino-acid mutations (default: None)
  --clades TSV          TSV file containing clade definitions by amino-acid (default: None)
  --output-node-data NODE_DATA_JSON
                        name of JSON file to save clade assignments to (default: None)
  --membership-name MEMBERSHIP_NAME
                        Key to store clade membership under; use "None" to not export this (default: clade_membership)
  --label-name LABEL_NAME
                        Key to store clade labels under; use "None" to not export this (default: clade)

Here is the full documentation: augur clades — Augur 22.2.0 documentation

Thanks! It was an old, frozen version of augur. Now it runs smoothly.
