Hi!
I have just finalized a build on ~1750 genomes that we have collected so far in Hawaii.
Lately, B.1.429 has been the dominant lineage detected here.
For this particular build, I looked at the interesting logistic_growth value, calculated as shown in this
fascinating discussion, mostly between Trevor @trvrb and John @jlhudd: https://github.com/nextstrain/ncov/pull/595
I need more time to understand the subtleties of whether this calculated value really reflects a lineage that is more transmissible or better fit to expand in a certain context, given the limitations caused by non-uniform or insufficient sampling. Still, I would be very grateful if you could look at the observations below and tell me whether you think this is a red herring or a true sub-lineage of interest.
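For readers unfamiliar with the idea, here is a minimal sketch of one way such a growth value can be estimated: take a clade's frequency at several recent time points, logit-transform it, and use the slope of a straight-line fit as the growth rate. This is only an illustration under that assumption, not necessarily the exact calculation discussed in the pull request; the function name `logistic_growth_rate`, the `eps` clipping value, and the toy frequencies are all hypothetical.

```python
import numpy as np

def logistic_growth_rate(times, frequencies, eps=1e-3):
    """Slope of a linear fit to logit-transformed clade frequencies.

    Hypothetical helper for illustration only. `eps` keeps frequencies
    away from 0 and 1 so that the logit stays finite.
    """
    t = np.asarray(times, dtype=float)
    f = np.clip(np.asarray(frequencies, dtype=float), eps, 1.0 - eps)
    logit = np.log(f / (1.0 - f))
    slope, _intercept = np.polyfit(t, logit, 1)
    return slope

# Toy data (not real frequencies): a clade following an exact logistic
# curve with a growth rate of 0.4 per time unit.
weeks = np.arange(6)
toy_freqs = 1.0 / (1.0 + np.exp(-(0.4 * weeks - 3.0)))

growing = logistic_growth_rate(weeks, toy_freqs)
# The complementary frequencies decline at the same rate.
declining = logistic_growth_rate(weeks, 1.0 - toy_freqs)
print(growing, declining)
```

Under this reading, a value above 0 means the clade's share of sampled genomes has been rising over the fitted window, and a value below 0 means it has been falling. The time unit and window length here are arbitrary, so the numbers are not comparable to those shown in a real build.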
The B.1.429 lineage seems to be split into a few sub-clades that are either below or above 0 in terms of logistic_growth:
What seems to differentiate the clades that are above 0 from the ones that are below 0 is a mutation in the nucleocapsid protein, N:M234I (M is blue in the screenshot above and I is yellow).
If you look at the time-resolved tree of these samples colored by logistic_growth, the clades below the N:M234I mutation are all orange-red, showing the same idea:
Now, if I look in the same way at the latest North America Nextstrain build (which, however, has only 108 B.1.429 sequences):
https://nextstrain.org/ncov/north-america?branchLabel=aa&c=gt-N_234&d=tree,map,frequencies&f_pango_lineage=B.1.429&l=scatter&m=div&p=grid&scatterY=logistic_growth
it looks somewhat similar, since there are two subclades of B.1.429, one with a slightly positive logistic_growth value (~0.38) and one with a slightly negative value (~ -0.05).
Unfortunately, both these values fall into the same color bin if you color by logistic_growth, but you can still see the positive or negative value if you hover over the tips:
https://nextstrain.org/ncov/north-america?branchLabel=aa&c=logistic_growth&d=tree,map,frequencies&f_pango_lineage=B.1.429&m=div&p=grid
What is different is that there are some subclades that are not under N:M234I but still have a positive value for logistic_growth. In fact, everything that is under C12100T and C8947T seems to have a positive value. In our tree, however, all the samples have both C12100T and C8947T, yet they are still separated into subclades with positive and negative logistic_growth values.
The Andersen lab has a community Nextstrain build that has 517 B.1.429 sequences, but unfortunately their builds do not calculate logistic_growth, at least not yet:
https://nextstrain.org/community/andersen-lab/HCoV-19-Genomics-Nextstrain/hCoV-19/usa/sandiego?branchLabel=aa&c=gt-N_234&d=tree,frequencies&f_pangolin_lineage=B.1.429&p=full
So my questions are:
- can we assert that we see some subclades that have increased transmissibility, or is it just sampling noise?
- can we link the higher logistic_growth value to a particular mutation or not (it seems not, by looking at the global build data)?
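To make the first question more concrete, one rough check is to ask how much a fitted growth rate moves when the clade counts are redrawn at random. The sketch below does this with a simple parametric bootstrap, assuming the growth rate is the slope of a linear fit to logit-transformed frequencies. The helper names and all counts are hypothetical toy values, not data from the Hawaii build, and the check only captures random sampling noise, not bias from non-uniform sampling.

```python
import numpy as np

def logit_slope(times, freqs, eps=1e-3):
    """Slope of a linear fit to logit-transformed frequencies (hypothetical helper)."""
    f = np.clip(np.asarray(freqs, dtype=float), eps, 1.0 - eps)
    return np.polyfit(np.asarray(times, dtype=float), np.log(f / (1.0 - f)), 1)[0]

def bootstrap_growth_interval(times, clade_counts, total_counts, n_boot=1000, seed=0):
    """95% interval of the slope when clade counts are redrawn binomially."""
    rng = np.random.default_rng(seed)
    k = np.asarray(clade_counts)
    n = np.asarray(total_counts)
    p = k / n  # observed clade frequency at each time point
    slopes = np.empty(n_boot)
    for i in range(n_boot):
        resampled = rng.binomial(n, p) / n
        slopes[i] = logit_slope(times, resampled)
    low, high = np.percentile(slopes, [2.5, 97.5])
    return low, high

# Toy example: the same underlying trend (rate 0.4), sampled deeply vs. sparsely.
weeks = np.arange(6)
true_freqs = 1.0 / (1.0 + np.exp(-(0.4 * weeks - 3.0)))

deep_n = np.full(6, 10000)
sparse_n = np.full(6, 20)
deep = bootstrap_growth_interval(weeks, np.round(true_freqs * deep_n).astype(int), deep_n)
sparse = bootstrap_growth_interval(weeks, np.round(true_freqs * sparse_n).astype(int), sparse_n)
print("deep sampling:  ", deep)
print("sparse sampling:", sparse)
```

If the interval for a subclade comfortably spans 0, a positive or negative point estimate on its own says little. With only a handful of sequences per time point, that is the expected outcome even when the underlying trend is real.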
Your insight would be highly appreciated.
Thank you,
Razvan