Pangolin AY.* designations

enabieva · October 8, 2021, 8:15am

Hi, I apologize for asking about a topic that is only marginally related to Nextstrain, but this is the only forum on SARS-CoV-2 phylogenomics that I’m aware of (please point me to more appropriate places if such exist).

Could anyone please explain how Pangolin designations work? In particular, for the AY.* sublineages of Delta, there is a list of defining mutations (New AY lineages and an update to AY.4-AY.12 – Pango Network), yet the actual assignments do not always correspond well to these definitions (e.g., Possibly wrong assignments of AY.4 · Issue #221 · cov-lineages/pango-designation · GitHub). How does that happen? Is it that once a novel clade, defined by certain mutation(s), is discovered, the Pango decision tree is trained on that clade and may pick up on artefacts that happen to be present in those sequences but are unrelated to the defining mutations?
It seems that the Pango team eventually corrects these mistakes (a couple of weeks ago, AY.12 seemed to be dominating a lot of the data, but that seems to have been an artefact that later got fixed), but it takes a while.

Does anyone have experience dealing with these situations where GISAID seems to be assigning lineages too broadly (e.g., for the purposes of tracking the growth of different sublineages of Delta)? What do you do? Do you trust the current Pangolin assignments? Do you have custom filters?
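
For concreteness, a custom filter of the kind asked about could look roughly like the sketch below. It only illustrates the idea of checking each assigned lineage against a list of defining mutations. The lineage names, mutation labels, record layout, and the function names `missing_defining` and `split_by_definition` are all hypothetical placeholders, not real Pango definitions or part of any Pangolin or GISAID tooling.

```python
# Hypothetical defining-mutation sets. These are placeholders for illustration
# only and do NOT reflect real Pango lineage definitions.
DEFINING = {
    "AY.X": {"geneA:M1", "geneB:M2"},
    "AY.Y": {"geneC:M3"},
}


def missing_defining(lineage, mutations, defining=DEFINING):
    """Return the defining mutations a sequence lacks for its assigned lineage.

    Returns None when no definition is known for the lineage, so callers can
    tell "not checked" apart from "checked and complete" (an empty set).
    """
    required = defining.get(lineage)
    if required is None:
        return None
    return required - set(mutations)


def split_by_definition(records, defining=DEFINING, max_missing=0):
    """Split records into kept IDs and flagged (ID, missing mutations) pairs.

    Each record is assumed to be a dict with the keys "id", "lineage", and
    "mutations" (an iterable of mutation labels). Allowing max_missing > 0
    tolerates dropouts, e.g. from amplicon failures in low-coverage regions.
    """
    kept, flagged = [], []
    for rec in records:
        missing = missing_defining(rec["lineage"], rec["mutations"], defining)
        if missing is not None and len(missing) > max_missing:
            flagged.append((rec["id"], sorted(missing)))
        else:
            kept.append(rec["id"])
    return kept, flagged


records = [
    {"id": "seq1", "lineage": "AY.X", "mutations": ["geneA:M1", "geneB:M2"]},
    {"id": "seq2", "lineage": "AY.X", "mutations": ["geneA:M1"]},
    {"id": "seq3", "lineage": "B.Z", "mutations": []},  # no definition known
]
kept, flagged = split_by_definition(records)
```

Sequences whose lineage has no entry in the table are kept rather than flagged, which is a design choice of this sketch: the filter can only second-guess assignments for lineages it has a definition for.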
