I’m curious what others think about the deletions that regularly show up in sequenced 2022 monkeypox genomes. As I understand it, Nextstrain doesn’t really look at these, since it’s built on the COVID framework, where large deletions were assumed to be sequencing errors more often than actual deletions.
I noticed that everything our lab sequenced here in the Midwest USA had two deletions: a gap at 133164-133173 and a gap at 150551-150567. Looking at the BED files for our consensus sequences, these seem to be actual deletions, not sequencing artifacts. I also noted that a few of the sequences I downloaded from GISAID had similar deletions.
Since Nextstrain doesn’t display any data on these gaps by default, I threw the entire Nextstrain fasta into the command-line version of nextclade and had it spit out the deletions for all strains. Then I edited the Nextstrain metadata file to add new columns for gaps at 133164-133173, 150551-150567, and ones that started at 179077. (There seem to be several different gaps at that last position: the majority were 179077-179148, while some ended at 179094 or 179175. I colored these differently.)
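For anyone who wants to try the same thing, here is a rough sketch of the metadata step in Python. It is not my exact script. It assumes the nextclade TSV output has a `seqName` column and a `deletions` column holding comma-separated `start-end` ranges, and that the metadata is keyed by a `strain` column; check those names against your own files. The trait column names (`gap_133164` etc.) are just made up for illustration.

```python
# Hypothetical trait columns, keyed by the deletion start coordinate of interest.
TRAITS = {"gap_133164": 133164, "gap_150551": 150551, "gap_179077": 179077}


def parse_deletions(field):
    """Parse a 'start-end,start-end' string into a list of (start, end) tuples."""
    ranges = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        start, _, end = item.partition("-")
        # A single-base deletion may be written without a '-end' part.
        ranges.append((int(start), int(end or start)))
    return ranges


def deletion_traits(field):
    """Return {trait: 'start-end' or 'none'} for each start coordinate in TRAITS."""
    by_start = {start: end for start, end in parse_deletions(field)}
    return {
        name: (f"{start}-{by_start[start]}" if start in by_start else "none")
        for name, start in TRAITS.items()
    }


def annotate(metadata_rows, nextclade_rows, name_col="strain"):
    """Add one column per trait to each metadata row (rows are plain dicts)."""
    lookup = {
        row["seqName"]: deletion_traits(row.get("deletions") or "")
        for row in nextclade_rows
    }
    missing = {name: "unknown" for name in TRAITS}
    return [{**row, **lookup.get(row[name_col], missing)} for row in metadata_rows]
```

Both inputs can be read with `csv.DictReader(handle, delimiter="\t")` and the result written back with `csv.DictWriter`. Keeping the full `start-end` string as the trait value is what lets the 179077 gaps be colored separately by where they end.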
When I looked at my phylo trees colored by these traits, a few things stood out:
- the 133164 gap seems to be in nearly every B.1 sample. There’s only a handful of exceptions, which could be due to different consensus-building program settings labeling those gaps as Ns instead of gaps. This deletion, unlike the other two, seems like a “true” deletion that wasn’t taken into account when the B.1 nextclade model was built.
- the 150551 and 179077 gaps were present in only a fraction of the samples, but do not follow any pattern that I can see in the phylo tree. If these were real deletions, I’d expect them to follow some pattern, showing up at the same time as certain mutations. And if they’re real, they would probably need to be taken into account in tree building, right? The randomness makes me think these are somehow sequencing or consensus-building artifacts.
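One way to check the "Ns instead of gaps" idea for the exceptions is to look directly at what each consensus has at those coordinates. A minimal sketch, assuming the sequence is already aligned to the reference so that positions match reference coordinates (for example, an aligned fasta written by nextclade); the function name and category labels are my own:

```python
def classify_region(aligned_seq, start, end):
    """Classify a 1-based, inclusive region of a reference-aligned sequence."""
    region = aligned_seq[start - 1:end].upper()
    if not region:
        return "out_of_range"
    chars = set(region)
    if chars <= {"-"}:
        return "gap"            # written as a deletion
    if chars <= {"N"}:
        return "masked"         # written as Ns, e.g. low coverage
    if chars <= {"-", "N"}:
        return "gap_or_masked"  # mix of both
    return "sequence"           # at least one called base is present
```

For example, `classify_region(seq, 133164, 133173)` on each B.1 exception would show whether the region is really present or just masked.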
I’m curious what other people think about these, or other common deletions I haven’t spotted yet.