Augur alignment failing - problem with mafft

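The error you’re seeing is almost certainly because MAFFT ran out of memory. The “mafft: line 2747: 35443 Killed” line is the clue: a bare “Killed” message usually means the operating system force-terminated the process, which on Linux most often happens when the machine runs out of memory.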

I may have been mistaken when I said that 64 GB of RAM was enough for the full GISAID dataset (~424k sequences in your case above). Looking at the benchmarks for our production builds:

[image: benchmarks for the production builds]

I see that while the 7 instances of the “tree” and “refine” steps sum to less than 64 GB of RAM, the shared “align” step uses more than 120 GB all by itself.

You could further subset the data before aligning so that it fits within 64 GB, or possibly adjust the memory strategy MAFFT uses. augur align runs MAFFT with the --nomemsave option, which significantly increases memory requirements but makes the alignment much faster. Disabling this might help you fit the alignment in 64 GB. We also have another aligner in the works, which I believe would reduce memory requirements, but I’m not sure when that’ll be ready.
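
If you go the subsetting route, here is a minimal sketch of one way to draw a uniform random subset from a large FASTA file without loading the whole thing into memory. It uses plain reservoir sampling with the Python standard library only. The function names (read_fasta, subsample_fasta, write_fasta) are made up for illustration and are not part of augur. A uniform sample also ignores geographic and temporal balance, which a real build would probably want to preserve, so treat this as a rough starting point rather than a recommended workflow.

```python
import random
from typing import Iterable, Iterator, List, TextIO, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(handle: TextIO) -> Iterator[Record]:
    """Yield (header, sequence) pairs from a FASTA stream, one record at a time."""
    header = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample_fasta(handle: TextIO, k: int, seed: int = 0) -> List[Record]:
    """Return up to k records chosen uniformly at random (reservoir sampling).

    Only k records are held in memory at any time, so this works on inputs
    much larger than RAM.
    """
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, record in enumerate(read_fasta(handle)):
        if i < k:
            reservoir.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = record
    return reservoir


def write_fasta(records: Iterable[Record], handle: TextIO) -> None:
    """Write records back out, one sequence per line."""
    for header, sequence in records:
        handle.write(f">{header}\n{sequence}\n")
```

Usage would look something like the following, where the file names and the target of 100,000 sequences are placeholders to adjust for your own data and memory budget:

```python
with open("sequences.fasta") as src, open("subset.fasta", "w") as dst:
    write_fasta(subsample_fasta(src, k=100_000, seed=42), dst)
```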
