How to interpret "ORF8:I121X" mutation annotation?

I just ran a collection of samples through the Nextclade web interface and one of the recurrent mutations is “ORF8:I121X”. I assume this isn’t a nonsense mutation, since those appear to be annotated like “ORF14:Q46*”. However, there is also an (empty) deletions column, so I’m guessing the “X” doesn’t mean deleted. So what does it mean?

Thanks,
Alex

Hi @iskander, thanks for reaching out! * in the codon notation is a stop codon, not a “nonsense mutation”. Every possible combination of nucleotides in a codon codes for either an amino acid or a stop codon.
X is the character used to designate that a codon cannot be translated. Here, that is because there’s a gap which isn’t ‘in-frame’ (it doesn’t line up with the 3 bases that code for an amino acid).
If the nucleotide gaps do line up, the AA will show “-”, just like for nucleotides.
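
The rule described above can be sketched in a few lines of Python. This is only an illustration of the convention (a full gap gives "-", a partial gap gives "X", a stop codon gives "*"), not Nextclade's actual code; the function names `translate_codon` and `translate` are made up for this example, and treating ambiguous bases such as N as "X" is an assumption.

```python
from itertools import product

# Standard genetic code, built from the canonical TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_codon(codon):
    """Translate one aligned codon (hypothetical helper, not Nextclade's API)."""
    if codon == "---":
        return "-"  # whole codon deleted: in-frame gap
    # Partial gaps and ambiguous bases (assumption) are not in the table.
    return CODON_TABLE.get(codon, "X")


def translate(aligned_seq):
    """Translate an aligned nucleotide string codon by codon."""
    usable = len(aligned_seq) - len(aligned_seq) % 3  # ignore a trailing partial codon
    return "".join(translate_codon(aligned_seq[i:i + 3]) for i in range(0, usable, 3))


print(translate("ATTTAA---A-T"))  # I*-X
```

Here `ATT` gives I, `TAA` gives the stop `*`, `---` gives `-`, and `A-T` gives `X` because the gap does not cover the whole codon.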

So the X here means there’s some kind of deletion in the nucleotides. NextClade doesn’t currently use amino-acid aware alignment, so it’s very possible the deletion is ‘in-frame’ but we haven’t aligned it right. You’d need to look back to the sequence at this position to figure out if that’s the case, or if it’s a deletion that can’t be aligned ‘in-frame’. However, we’re working hard on a better algorithm for this!

I hope this helps!

Hey Emma, thank you so much for your reply!

I’m guessing that “nonsense mutation” is a difference in terminology between viral vs. cancer/human genomics, but it’s the same idea (https://www.nature.com/scitable/definition/nonsense-mutation-228/), introduction of a new stop codon before the original annotated stop.

I tracked down the offending mutation and it’s a single nucleotide deletion at position 28254. This causes a peculiar frameshift, replacing the final amino acid of ORF8 with a short novel sequence: SKRTN. I posted a little more information in a GitHub issue: “Error in annotating frameshift at the end of ORF8 (previously: "X" in amino acid substitutions)”, Issue #303 in nextstrain/nextclade.
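
To make the effect concrete, here is a toy sketch of how a single-nucleotide deletion in the last codon before a stop shifts the frame and extends the protein until the next stop in the new frame. The sequence below is invented for illustration and is not the real ORF8 sequence; `translate_until_stop` is a hypothetical helper.

```python
from itertools import product

# Standard genetic code, built from the canonical TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_until_stop(seq):
    """Translate from position 0 and stop after the first stop codon."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Invented toy gene: ATG AAA ATC TAA, followed by downstream sequence.
reference = "ATGAAAATCTAAACGTTGACC"
# Delete one nucleotide (0-based index 6, the first base of the last codon).
mutated = reference[:6] + reference[7:]

print(translate_until_stop(reference))  # MKI*
print(translate_until_stop(mutated))    # MKSKR*
```

In this toy case the final amino acid (I) is replaced by a short new tail (SKR) before a new stop codon is reached, which is the same kind of change as the one at the end of ORF8.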

Let me know if you want help implementing the effect annotation for frameshifts. For now, would it be possible to detect out-of-frame mutations and annotate them as “fs” instead of “X”?
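
As a rough sketch of what I mean: a deletion whose length is not a multiple of 3 shifts the reading frame, so it could be labelled “fs” at the first affected codon. The function name, the label format, and the simplification of labelling only the first affected codon are my own assumptions, not anything Nextclade does; the ORF8 start coordinate (27894) is taken from the reference annotation as I understand it and should be double-checked.

```python
def annotate_deletion(gene, gene_start, del_start, del_length, ref_aa):
    """Label a deletion inside a gene (hypothetical helper).

    Coordinates are 1-based genome positions. Only the first affected
    codon is labelled, which is a simplification: an in-frame deletion
    that is not codon-aligned touches two codons.
    """
    codon_number = (del_start - gene_start) // 3 + 1
    if del_length % 3 != 0:
        # Length not divisible by 3: the downstream reading frame shifts.
        return f"{gene}:{ref_aa}{codon_number}fs"
    # Length divisible by 3: reading frame is preserved.
    return f"{gene}:{ref_aa}{codon_number}-"


# Assumed ORF8 start in the reference genome: 27894.
print(annotate_deletion("ORF8", 27894, 28254, 1, "I"))  # ORF8:I121fs
```

With these inputs, position 28254 falls in codon 121, so the single-nucleotide deletion would be reported as “ORF8:I121fs” instead of “ORF8:I121X”.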

Ah, apologies about misunderstanding the terminology! But glad the answer helped.

It would be better for @rneher or Ivan to answer about possible future functionality of NextClade. At the moment they are working on an algorithm which is codon-aware but I imagine single-nucleotide deletions would still be annotated X as is common - however, perhaps they can weigh in with more detail!

Yes, this is a frame-shifting deletion at the end of ORF8. We are not currently dealing with frame-shifting deletions and we rely on the reference annotation. Invalid codons are translated as X.