In the SARS-CoV-2 builds, several positions in the alignment are masked because of various problems. See here.
I am particularly wondering about position 21987, which corresponds to amino acid 142 in Spike. The position is listed as
amplicon_drop_or_primer_artefact,back_to_ref which means:
## amplicon_drop_or_primer_artefact = Amplicon dropout and/or failed primer trimming
## back_to_ref = The alternate allele is not called for this position due to issues with amplicon dropout and primer trimming. For more details, see: https://github.com/W-L/ProblematicSites_SARS-CoV2/issues/7 and https://github.com/cov-lineages/pango-designation/issues/95
Does anyone know if this information is still valid? In our data we see the substitution G142D in sequences across different sequencing platforms and protocols. However, it might be overrepresented in the SWIFT protocol, which would be consistent with a primer-related issue. But looking at outbreak.info, G142D is very common: outbreak.info
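For anyone who wants to double-check the coordinate mapping, here is a minimal sketch of how genome position 21987 translates to a Spike codon. It assumes the Spike CDS starts at position 21563 (1-based) in the Wuhan-Hu-1 reference (MN908947.3); the function name `genome_pos_to_codon` is only illustrative and not taken from any Nextstrain tooling.

```python
# Start of the Spike (S) CDS in the Wuhan-Hu-1 reference (MN908947.3), 1-based.
# Assumption: alignment coordinates follow this reference.
S_START = 21563


def genome_pos_to_codon(pos, cds_start=S_START):
    """Map a 1-based genome position to (codon number, position within codon).

    Both returned values are 1-based.
    """
    offset = pos - cds_start
    if offset < 0:
        raise ValueError("position lies upstream of the CDS start")
    return offset // 3 + 1, offset % 3 + 1


codon, codon_pos = genome_pos_to_codon(21987)
print(codon, codon_pos)  # 142 2
```

So 21987 is the second base of Spike codon 142, which fits a G-to-D change (G21987A) being the nucleotide substitution behind G142D.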