I have a sample that Nextclade assigns to 20H/501Y.V2, but when I build the tree for all my samples it is placed in 20C. I have manually verified the clade assignment from the sample's BAM file, and when I color the data by nucleotide at the two positions that separate 20H/501Y.V2 from 20C, both defining nucleotides are present.
I have verified that the clades.tsv file I’m using is up to date, and I’m struggling to think of what else I could be missing that would cause this error. Samples in 20I/501Y.V1 are all correctly identified, so I know the build can at least recognize the S:N501Y mutation.
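A minimal Python sketch of the kind of manual check described above, assuming clades.tsv is a tab-separated file with the columns clade, gene, site, and alt. The function names, clade names, sites, and alleles below are hypothetical placeholders, not the real definitions. It only compares the alleles of a single sample against each definition and does not reproduce how the tree build itself assigns clades.

```python
import csv
import io
from collections import defaultdict


def load_clade_definitions(handle):
    """Read a clades.tsv-style file into {clade: [(gene, site, alt), ...]}.

    Assumes four tab-separated columns: clade, gene, site, alt.
    Header lines, comments, blank lines, and short rows are skipped.
    """
    definitions = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4 or row[0].startswith("#") or row[0] == "clade":
            continue
        clade, gene, site, alt = row[:4]
        definitions[clade].append((gene, int(site), alt))
    return dict(definitions)


def matching_clades(definitions, genotype):
    """Return clades whose defining alleles are all present in the sample.

    `genotype` maps (gene, site) to the allele observed in the sample.
    """
    return [
        clade
        for clade, mutations in definitions.items()
        if all(genotype.get((gene, site)) == alt for gene, site, alt in mutations)
    ]


# Placeholder definitions: NOT the real clade-defining sites.
EXAMPLE_TSV = (
    "clade\tgene\tsite\talt\n"
    "CLADE_A\tnuc\t100\tT\n"
    "CLADE_B\tnuc\t100\tT\n"
    "CLADE_B\tnuc\t200\tG\n"
)

definitions = load_clade_definitions(io.StringIO(EXAMPLE_TSV))

# Alleles read off the sample (for example from the BAM pileup); placeholders.
sample_genotype = {("nuc", 100): "T", ("nuc", 200): "G"}

print(matching_clades(definitions, sample_genotype))
```

With the real clades.tsv and the alleles observed in the sample, a check like this shows whether the sample satisfies the 20H/501Y.V2 definition on its own, independent of where the tree places it.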
Any suggestion as to where I’m going wrong with this is greatly appreciated!
Thanks in advance.
Edit: I have solved the problem in what feels like a somewhat hacky way, by adding another sample that is in 20H/501Y.V2. This seems to encourage the algorithm to recognize that our sample is indeed a variant.
However, I suspect there is another solution that would let me include only samples from my area and still get the correct clades. If anyone has an idea of where I’m going wrong, I would appreciate knowing.