Hi, with only 200 sequences the Monkeypox build is already getting pretty slow (197,000 bases and a lot of Ns).
So I am now removing from the alignment every column in which all sequences have either the same base as the reference or an N.
- I am writing the correspondence between coordinates in the original alignment and in the new alignment to a .tsv file.
- Then I’m running iqtree, augur refine --timetree, augur ancestral,
- Then I’m parsing the resulting muts.json, rewriting the mutations in the original coordinates, and rewriting the ancestral sequences in the same way.
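To make the column filtering concrete, here is a minimal sketch in Python. It is not existing augur code; the function name `compress_alignment` and the 1-based (compressed, original) layout of the coordinate table are assumptions for illustration. Note that under this rule a gap (`-`) counts as a difference from the reference, so gapped columns are kept.

```python
def compress_alignment(ref, seqs):
    """Drop columns where every sequence has the reference base or an N.

    ref  : reference sequence (str), aligned to the sequences
    seqs : dict mapping sequence name -> aligned sequence (str)
    Returns (compressed sequences, coordinate table), where the table is a
    list of (compressed position, original position) pairs, both 1-based.
    Hypothetical helper, not part of augur.
    """
    keep = [
        i
        for i, ref_base in enumerate(ref)
        # keep the column if at least one sequence has an informative difference
        if any(s[i].upper() not in (ref_base.upper(), "N") for s in seqs.values())
    ]
    compressed = {name: "".join(s[i] for i in keep) for name, s in seqs.items()}
    coords = [(new + 1, old + 1) for new, old in enumerate(keep)]
    return compressed, coords


# Toy example: only original columns 2 and 6 carry a real difference.
ref = "ACGTAC"
seqs = {"a": "ACGTNC", "b": "ATGTAC", "c": "ACGNAA"}
compressed, coords = compress_alignment(ref, seqs)
print(compressed)  # {'a': 'CC', 'b': 'TC', 'c': 'CA'}
print(coords)      # [(1, 2), (2, 6)]
```

The coordinate table is what would be written to the .tsv, one pair per row.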
What do you think? What about adding commands to augur to do this? For example:
augur compress alignment --input alignment.fasta --output-alignment compressed.fasta --output-coordinates coords.tsv
augur compress renumber --input-mutations mutsCompressed.json --input-coordinates coords.tsv --output muts.json
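The renumbering step could look roughly like the sketch below. It assumes the node-data JSON has the shape `{"nodes": {name: {"muts": ["A12T", ...], "sequence": "..."}}}` and that mutations are written as ancestral base, 1-based position, derived base; both are assumptions about the file, and all function names are hypothetical. Expanded ancestral sequences get the reference base at every dropped column, so any Ns that were there are not reproduced.

```python
import re

# ancestral base, 1-based position, derived base, e.g. "C1T"
MUT = re.compile(r"^([A-Za-z-])(\d+)([A-Za-z-])$")


def renumber_mutation(mut, new_to_old):
    """Rewrite one mutation string from compressed to original coordinates."""
    m = MUT.match(mut)
    if m is None:
        raise ValueError(f"unrecognised mutation string: {mut!r}")
    anc, pos, der = m.groups()
    return f"{anc}{new_to_old[int(pos)]}{der}"


def expand_sequence(compressed, ref, new_to_old):
    """Rebuild a full-length sequence: reference bases everywhere, except
    at the kept columns, which come from the compressed sequence."""
    full = list(ref)
    for new, base in enumerate(compressed, start=1):
        full[new_to_old[new] - 1] = base
    return "".join(full)


def renumber_node_data(node_data, ref, new_to_old):
    """Return a copy of the node data with mutations and sequences
    expressed in original-alignment coordinates."""
    out = {"nodes": {}}
    for name, node in node_data["nodes"].items():
        fixed = dict(node)
        fixed["muts"] = [renumber_mutation(m, new_to_old) for m in node.get("muts", [])]
        if "sequence" in node:
            fixed["sequence"] = expand_sequence(node["sequence"], ref, new_to_old)
        out["nodes"][name] = fixed
    return out
```

Here `new_to_old` is a dict built from the coordinates .tsv, mapping each compressed position to its original position.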
However, augur refine --timetree would then need to be given the length of the original alignment, so that it can scale the clock rate accordingly.
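The scaling itself is simple, assuming the clock rate is expressed in substitutions per site per year: the same mutations are spread over fewer sites, so the per-site rate on the compressed alignment is larger by the ratio of the two lengths. A minimal sketch with hypothetical function names and made-up numbers:

```python
def rate_on_compressed(rate_per_site, original_length, compressed_length):
    """Per-site clock rate to use on the compressed alignment.

    The expected number of substitutions per year is unchanged, so
    rate_compressed * compressed_length == rate_original * original_length.
    """
    return rate_per_site * original_length / compressed_length


def rate_on_original(rate_compressed, original_length, compressed_length):
    """Convert a rate estimated on the compressed alignment back to a
    per-site rate on the original alignment."""
    return rate_compressed * compressed_length / original_length


# Illustrative values only: 197,000 original columns, 500 kept columns.
scaled = rate_on_compressed(1e-5, 197_000, 500)
print(scaled)
print(rate_on_original(scaled, 197_000, 500))
```

The same factor applies to branch lengths measured in substitutions per site on the compressed alignment.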