Regarding Error with nextstrain - Memory Problems

@vrmarathe Yes, in this case it looks like you are trying to build a tree with all 3+ million SARS-CoV-2 genomes. IQ-TREE’s memory requirements appear to scale with the size of the input multiple sequence alignment, which is now hundreds of gigabytes of uncompressed data. Beyond the memory requirements, inferring a phylogeny of this size with IQ-TREE would take far longer than you’d want to wait.
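To get a feel for how input size grows with the number of genomes, here is a rough back-of-the-envelope sketch in Python. The function name, the 50-byte header, and the example figures are my own assumptions for illustration, not anything from Nextstrain or IQ-TREE. It only estimates the raw size of an uncompressed aligned FASTA file; it does not model IQ-TREE's working memory, which likely needs more than the input itself.

```python
def estimate_alignment_bytes(n_sequences: int, alignment_length: int,
                             header_bytes: int = 50) -> int:
    """Rough size of an uncompressed aligned FASTA file in bytes.

    Assumes one byte per aligned site, a single sequence line per record,
    and a header line of about `header_bytes` characters (a guess).
    """
    # header line + newline + sequence line + newline
    bytes_per_record = header_bytes + 1 + alignment_length + 1
    return n_sequences * bytes_per_record


def to_gigabytes(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_bytes / 1e9


if __name__ == "__main__":
    # Hypothetical inputs: plug in your own sequence count and alignment width.
    n_sequences = 3_000_000
    alignment_length = 29_903  # length of the Wuhan-Hu-1 reference genome
    size_gb = to_gigabytes(estimate_alignment_bytes(n_sequences, alignment_length))
    print(f"~{size_gb:.0f} GB uncompressed")
```

The estimate is linear in the number of sequences, so it keeps growing as more genomes are added; the real file can be larger depending on header lengths and how wide the alignment is.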

If you are interested in very large tree-building efforts, check out the work from Russ Corbett-Detig et al. at UCSC, like this one, which uses the Taxonium and UShER tools to view millions of samples in a single tree.
