Hello all,
Firstly, thanks for the contribution to the field with Nextstrain and all the associated resources!
We would like to use Nextstrain resources in our project, and I have been digging into your resources for a while.
However, after many hours of searching (please excuse me if my terminology is imprecise, as I'm new to the virus world), I still do not understand how to create a resource that lists all the mutations each Nextstrain clade contains for SARS-CoV-2. To be clear, I am not looking for the defining mutations that Nextstrain uses with augur clades to define clades on the tree (ncov/defaults/clades.tsv on the master branch of the nextstrain/ncov GitHub repository).
What I am looking for is a set of all the mutations that each clade contains.
In Nextclade CLI, I came across tree.json among the dataset files for SARS-CoV-2. It seems to contain all clades with their nucleotide changes, but when I compared individual clades with the mutations shown on covariants.org, I realized it is not comprehensive enough. It also does not match what clades.tsv provides.
I have searched extensively and have not found a resource from which I could extract all mutations for each clade.
I would be very grateful for any ideas or insights on this problem.
Thank you in advance!