Hello! Non-SARS-CoV-2 post - let’s talk about syphilis.
I’m using augur translate to lay synonymous mutations onto my whole-genome phylogeny of 242 T. pallidum pallidum genomes, most of which I’ve recently assembled and are not public yet. It’s working brilliantly, and the visualization in auspice is fantastic. However, I’d really like to get under the hood a bit more and extract all of the nt and AA mutations per branch. I know it is all in the relevant JSON files, and I can parse it semi-manually after removing some header information so that it reads into R, but I was hoping you all might have better tools or tips for getting at the information. I’m hopeful I’m just missing an optional argument that includes the information in a nexus tree or something! Perhaps whatever script projects the mutation data interactively onto the branches would be helpful?
Also, I am pretty good in R but dreadful in Python, so I may just be displaying my lack of knowledge of existing pipelines that would make this simple… apologies if that is the case!
Just for scale, we are talking on the order of a few thousand nt mutations along the different branches, and several hundred at the AA level.
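To make the goal concrete, here is a rough Python sketch of the kind of flattening I’m after: one row per branch per mutation, written out as CSV so it can go straight into R. It assumes the node-data JSON files have a top-level "nodes" object keyed by node name, with nt mutations in a "muts" list and AA mutations in an "aa_muts" object keyed by gene. That layout is my assumption from looking at my own files, and the function names are made up for illustration.

```python
import csv
import io


def flatten_mutations(nt_node_data, aa_node_data):
    """Return one dict per (branch, mutation) from two node-data dicts.

    Assumed layout (not verified against every augur version):
      nt_node_data["nodes"][node]["muts"]          -> list like ["A10G", ...]
      aa_node_data["nodes"][node]["aa_muts"][gene] -> list like ["K4R", ...]
    """
    rows = []
    # Nucleotide mutations: no gene attached, so leave that column empty.
    for node, info in nt_node_data.get("nodes", {}).items():
        for mut in info.get("muts", []):
            rows.append({"node": node, "type": "nt", "gene": "", "mutation": mut})
    # Amino acid mutations: grouped per gene on each branch.
    for node, info in aa_node_data.get("nodes", {}).items():
        for gene, muts in info.get("aa_muts", {}).items():
            for mut in muts:
                rows.append({"node": node, "type": "aa", "gene": gene, "mutation": mut})
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["node", "type", "gene", "mutation"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Toy input standing in for the real JSON files (invented node and gene names).
toy_nt = {"nodes": {"NODE_1": {"muts": ["A10G", "C25T"]}, "tipA": {"muts": []}}}
toy_aa = {"nodes": {"NODE_1": {"aa_muts": {"geneX": ["K4R"], "geneY": []}}}}
toy_rows = flatten_mutations(toy_nt, toy_aa)
```

With real files, the two dicts would come from json.load() on the nt and AA node-data JSONs, and the text from rows_to_csv() could be written to disk and read with read.csv() in R. Because only the "nodes" key is read, any other top-level information in the files is simply ignored rather than needing to be stripped by hand.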
Happy to share any JSON files or scripts that would be helpful.
Thanks a million!